Re: [PATCH 01/50] x86/boot/e820: Introduce arch/x86/include/asm/e820/types.h

2017-01-29 Thread Ingo Molnar

* Sam Ravnborg  wrote:

> On Sat, Jan 28, 2017 at 11:11:22PM +0100, Ingo Molnar wrote:
> > 
> > The plan is to keep the old UAPI header in place but the kernel won't
> > use it anymore - and after some time we'll try to remove it. (User-space
> > tools better have local copies of headers anyway, instead of relying
> > on kernel headers.)
> 
> The idea with uapi is that the kernel provides a sane set of headers
> to be used by user space.
> So we avoid random copies that are maintained by random people in random
> ways resulting in random bugs.

Your argument is simplistic and presents a false dichotomy: maintaining a copy
or fully sharing the header are not the only two ways to share the information
in the headers between the kernel and tooling. For example, perf uses a
half-automated method where headers are copied from the kernel, but also checked
automatically against the upstream kernel, and a (non-fatal) warning is emitted
during the build if the upstream header has changed.

For example today the perf build shows these UAPI header warnings:

 Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
 Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel

... because new bits were added to those two UAPI headers. For example the new
ARM bits were:

triton:~/tip/tools/perf> diff -up ../arch/arm/include/uapi/asm/kvm.h ../../arch/arm/include/uapi/asm/kvm.h
--- ../arch/arm/include/uapi/asm/kvm.h  2017-01-23 10:10:18.846003002 +0100
+++ ../../arch/arm/include/uapi/asm/kvm.h   2017-01-28 09:35:12.383587930 +0100
@@ -84,6 +84,15 @@ struct kvm_regs {
 #define KVM_VGIC_V2_DIST_SIZE  0x1000
 #define KVM_VGIC_V2_CPU_SIZE   0x2000
 
+/* Supported VGICv3 address types  */
+#define KVM_VGIC_V3_ADDR_TYPE_DIST 2
+#define KVM_VGIC_V3_ADDR_TYPE_REDIST   3
+#define KVM_VGIC_ITS_ADDR_TYPE 4
+
+#define KVM_VGIC_V3_DIST_SIZE  SZ_64K
+#define KVM_VGIC_V3_REDIST_SIZE (2 * SZ_64K)
+#define KVM_VGIC_V3_ITS_SIZE   (2 * SZ_64K)
+
 #define KVM_ARM_VCPU_POWER_OFF 0 /* CPU is started in OFF state */
 #define KVM_ARM_VCPU_PSCI_0_2  1 /* CPU uses PSCI v0.2 */
 
... so for these changes the perf side copy can be updated safely. Had the
changes been more intricate, they could be copied too, with the tooling source
code adapted as well.

See the tools/perf/check-headers.sh script.
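
To make the mechanism concrete, here is a minimal sketch in C of the kind of
check described above. It is not the real script (check-headers.sh is a shell
script); header_differs() and check_header() are hypothetical names, and the
byte-for-byte comparison is an assumption about what "differs" means:

```c
#include <stdio.h>

/*
 * Hypothetical helper, not taken from check-headers.sh.
 * Returns 1 if the two files differ, 0 if they are byte-identical,
 * and -1 if either file cannot be opened.
 */
static int header_differs(const char *tools_copy, const char *kernel_hdr)
{
	FILE *a = fopen(tools_copy, "rb");
	FILE *b = fopen(kernel_hdr, "rb");
	int ca, cb, ret = 0;

	if (!a || !b) {
		if (a)
			fclose(a);
		if (b)
			fclose(b);
		return -1;
	}

	/* Compare byte by byte; a length mismatch shows up as EOF vs. data. */
	do {
		ca = fgetc(a);
		cb = fgetc(b);
		if (ca != cb) {
			ret = 1;
			break;
		}
	} while (ca != EOF);

	fclose(a);
	fclose(b);
	return ret;
}

/*
 * Warn, but never fail the build. Returns 1 if a warning was printed.
 * A missing file is silently skipped (an assumption of this sketch).
 */
static int check_header(const char *path, const char *tools_root,
			const char *kernel_root)
{
	char copy[4096], upstream[4096];

	snprintf(copy, sizeof(copy), "%s/%s", tools_root, path);
	snprintf(upstream, sizeof(upstream), "%s/%s", kernel_root, path);

	if (header_differs(copy, upstream) == 1) {
		fprintf(stderr, "Warning: %s differs from kernel\n", path);
		return 1;
	}
	return 0;
}
```

A build step could call check_header() once per copied header and carry on
regardless of the result, which is what keeps the warning non-fatal.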

This IMHO is a far more intelligent and far more robust approach than blind
sharing or detached copies, because it:

 - forces new changes from upstream to be considered and adapted by tooling

 - guarantees header (and thus ABI) synchronization (eventually)

 - does not actually couple the two source code bases in a rigid fashion:

    - the upstream kernel is free to change those headers (at least from
      perf's POV) in any sane way, and those changes can be adapted.

    - every step is conscious and there's no way to accidentally break tooling
      via header changes - nor does tooling hinder the kernel from progressing
      its source code base.

It's basically a script based COW filesystem with guaranteed propagation and 
guaranteed synchronization.

> The step(s) outlined here can only result in inconsistency and
> can benefit neither user space nor the kernel in the long run.

That's simply not true, see above.

> The uapi shall be lean and clean headers, and shall include
> no info whatsoever that is not relevant for user space.

I agree with that characterization, and that will be even more so with my
changes: my series makes uapi/asm/bootparam.h more self-contained and leaner,
while still defining the full ABI.

> But requiring all user space programs (diverse libc variants,
> other programs) to maintain their own copy can only result in
> inconsistencies that are of benefit to no one.

That's simply not true, see above.

Note that my changes try to keep the 'UAPI promise' (in that the old e820.h
header is still around), while still modifying the kernel side.

What _IS_ insane is to somehow construe the UAPI headers as a rigid construct
that forces the kernel source to keep using poorly chosen names like
'struct e820entry' forever...

Thanks,

Ingo


[PATCH 6/6] UDC: Add Synopsys UDC Platform driver

2017-01-29 Thread Raviteja Garimella
This patch adds platform driver support for Synopsys UDC.

A new driver file (snps_udc_plat.c) is created for this purpose
where the platform driver registration is done based on OF
node.

Currently, UDCs integrated into Broadcom's iProc SoCs (Northstar2
and Cygnus) work with this driver.

New members are added to the UDC data structure for having platform
device support along with extcon and phy support.

Kconfig and Makefiles are modified to select platform driver for
compilation.

Signed-off-by: Raviteja Garimella 
---
 drivers/usb/gadget/udc/Kconfig |  14 ++
 drivers/usb/gadget/udc/Makefile|   1 +
 drivers/usb/gadget/udc/amd5536udc.h|  14 ++
 drivers/usb/gadget/udc/snps_udc_core.c |  54 --
 drivers/usb/gadget/udc/snps_udc_plat.c | 342 +
 5 files changed, 406 insertions(+), 19 deletions(-)
 create mode 100644 drivers/usb/gadget/udc/snps_udc_plat.c

diff --git a/drivers/usb/gadget/udc/Kconfig b/drivers/usb/gadget/udc/Kconfig
index 9178dd2..ff62339 100644
--- a/drivers/usb/gadget/udc/Kconfig
+++ b/drivers/usb/gadget/udc/Kconfig
@@ -253,6 +253,20 @@ config USB_SNP_CORE
  This IP is different to the High Speed OTG IP that can be enabled
  by selecting USB_DWC2 or USB_DWC3 options.
 
+config USB_SNP_UDC_PLAT
+   tristate "Synopsys USB 2.0 Device controller"
+   select USB_SNP_CORE
+   select USB_GADGET_DUALSPEED
+   depends on (USB_GADGET && OF)
+   default ARCH_BCM_IPROC
+   help
+ This adds Platform Device support for Synopsys Designware core
+ AHB subsystem USB2.0 Device Controller (UDC).
+
+ This driver works with UDCs integrated into Broadcom's Northstar2
+ and Cygnus SoCs.
+
+ If unsure, say N.
 #
 # Controllers available in both integrated and discrete versions
 #
diff --git a/drivers/usb/gadget/udc/Makefile b/drivers/usb/gadget/udc/Makefile
index 4f4fd62..ea9e1c7 100644
--- a/drivers/usb/gadget/udc/Makefile
+++ b/drivers/usb/gadget/udc/Makefile
@@ -37,4 +37,5 @@ obj-$(CONFIG_USB_FOTG210_UDC) += fotg210-udc.o
 obj-$(CONFIG_USB_MV_U3D)   += mv_u3d_core.o
 obj-$(CONFIG_USB_GR_UDC)   += gr_udc.o
 obj-$(CONFIG_USB_GADGET_XILINX) += udc-xilinx.o
+obj-$(CONFIG_USB_SNP_UDC_PLAT) += snps_udc_plat.o
 obj-$(CONFIG_USB_BDC_UDC)  += bdc/
diff --git a/drivers/usb/gadget/udc/amd5536udc.h b/drivers/usb/gadget/udc/amd5536udc.h
index c252457..7884281 100644
--- a/drivers/usb/gadget/udc/amd5536udc.h
+++ b/drivers/usb/gadget/udc/amd5536udc.h
@@ -16,6 +16,7 @@
 /* debug control */
 /* #define UDC_VERBOSE */
 
+#include 
 #include 
 #include 
 
@@ -28,6 +29,9 @@
 #define UDC_HSA0_REV 1
 #define UDC_HSB1_REV 2
 
+/* Broadcom chip rev. */
+#define UDC_BCM_REV 10
+
 /*
  * SETUP usb commands
  * needed, because some SETUP's are handled in hw, but must be passed to
@@ -112,6 +116,7 @@
 #define UDC_DEVCTL_BRLEN_MASK  0x00ff
 #define UDC_DEVCTL_BRLEN_OFS   16
 
+#define UDC_DEVCTL_SRX_FLUSH   14
 #define UDC_DEVCTL_CSR_DONE    13
 #define UDC_DEVCTL_DEVNAK  12
 #define UDC_DEVCTL_SD  10
@@ -564,7 +569,15 @@ struct udc {
u16 cur_intf;
u16 cur_alt;
 
+   /* for platform device and extcon support */
struct device   *dev;
+   struct phy  *udc_phy;
+   struct extcon_dev   *edev;
+   struct extcon_specific_cable_nb extcon_nb;
+   struct notifier_block   nb;
+   struct delayed_work drd_work;
+   struct workqueue_struct *drd_wq;
+   u32 conn_type;
 };
 
 #define to_amd5536_udc(g)  (container_of((g), struct udc, gadget))
@@ -580,6 +593,7 @@ int udc_enable_dev_setup_interrupts(struct udc *dev);
 int udc_mask_unused_interrupts(struct udc *dev);
 irqreturn_t udc_irq(int irq, void *pdev);
 void gadget_release(struct device *pdev);
+void empty_req_queue(struct udc_ep *ep);
 void udc_basic_init(struct udc *dev);
 void free_dma_pools(struct udc *dev);
 int init_dma_pools(struct udc *dev);
diff --git a/drivers/usb/gadget/udc/snps_udc_core.c b/drivers/usb/gadget/udc/snps_udc_core.c
index 5f95a65..98de074 100644
--- a/drivers/usb/gadget/udc/snps_udc_core.c
+++ b/drivers/usb/gadget/udc/snps_udc_core.c
@@ -41,7 +41,6 @@
 #include "amd5536udc.h"
 
 static void udc_tasklet_disconnect(unsigned long);
-static void empty_req_queue(struct udc_ep *);
 static void udc_setup_endpoints(struct udc *dev);
 static void udc_soft_reset(struct udc *dev);
 static struct udc_request *udc_alloc_bna_dummy(struct udc_ep *ep);
@@ -1248,7 +1247,7 @@ udc_queue(struct usb_ep *usbep, struct usb_request *usbreq, gfp_t gfp)
 }
 
 /* Empty request queue of an endpoint; caller holds spinlock */
-static void empty_req_queue(struct udc_ep *ep)
+void empty_req_queue(struct udc_ep *ep)
 {

[PATCH 5/6] DT bindings documentation for Broadcom IPROC USB Device controller.

2017-01-29 Thread Raviteja Garimella
The device node is used for UDCs integrated into Broadcom's
iProc family of SoCs. The UDC is based on Synopsys Designware
Cores AHB Subsystem USB Device Controller IP.

Signed-off-by: Raviteja Garimella 
---
 .../bindings/usb/brcm,iproc-snps-udc.txt   | 24 ++
 1 file changed, 24 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/usb/brcm,iproc-snps-udc.txt

diff --git a/Documentation/devicetree/bindings/usb/brcm,iproc-snps-udc.txt b/Documentation/devicetree/bindings/usb/brcm,iproc-snps-udc.txt
new file mode 100644
index 000..537dd4d
--- /dev/null
+++ b/Documentation/devicetree/bindings/usb/brcm,iproc-snps-udc.txt
@@ -0,0 +1,24 @@
+Broadcom IPROC USB Device controller.
+
+The device node is used for UDCs integrated into Broadcom's
+iProc family of SoCs. The UDC is based on Synopsys Designware
+Cores AHB Subsystem Device Controller.
+
+Required properties:
+ - compatible: should be "brcm,iproc-snps-udc"
+ - reg: Offset and length of UDC register set
+ - interrupts: description of interrupt line
+ - phys: phandle to phy node.
+ - extcon: phandle to the extcon device. This is optional and
+   not required for those that don't require extcon support.
+   Extcon support will be required if the UDC is connected to
+   a Dual Role Device Phy that supports both Host and Device
+   mode based on the external cable.
+
+Example:
+   udc_dwc: usb@664e {
+   compatible = "brcm,iproc-snps-udc";
+   reg = <0x664e 0x2000>;
+   interrupts = ;
+   phys = <_phy>;
+   extcon = <_phy>";
-- 
2.1.0



Re: [PATCH V2 1/2] ACPI: processor_perflib: Simplify code and stop using CPUFREQ_START

2017-01-29 Thread Rafael J. Wysocki
On Mon, Jan 30, 2017 at 8:07 AM, Rafael J. Wysocki  wrote:
> On Mon, Jan 30, 2017 at 5:29 AM, Viresh Kumar  wrote:
>> acpi_processor_ppc_notifier() can live without using CPUFREQ_START
>> (which is gonna be removed soon), as it is only used while setting
>> ignore_ppc to 0. This can be done with the help of "ignore_ppc < 0"
>> check alone. The notifier function anyway ignores all events except
>> CPUFREQ_ADJUST and dropping CPUFREQ_START wouldn't harm at all.
>>
>> Once CPUFREQ_START event is removed from the cpufreq core,
>> acpi_processor_ppc_notifier() will get called only for CPUFREQ_NOTIFY or
>> CPUFREQ_ADJUST event. Drop the return statement from the first if block
>> to make sure we don't ignore any such events.
>>
>> Signed-off-by: Viresh Kumar 
>>
>> ---
>> V1->V2:
>> - Improved changelog
>> - Don't move the first if block to a later point, as it becomes useless
>>   then.
>> ---
>>  drivers/acpi/processor_perflib.c | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/drivers/acpi/processor_perflib.c b/drivers/acpi/processor_perflib.c
>> index f0b4a981b8d3..18b72eec3507 100644
>> --- a/drivers/acpi/processor_perflib.c
>> +++ b/drivers/acpi/processor_perflib.c
>> @@ -75,10 +75,8 @@ static int acpi_processor_ppc_notifier(struct notifier_block *nb,
>> struct acpi_processor *pr;
>> unsigned int ppc = 0;
>>
>> -   if (event == CPUFREQ_START && ignore_ppc <= 0) {
>> +   if (ignore_ppc < 0)
>> ignore_ppc = 0;
>> -   return 0;
>> -   }
>
> Don't we want to return from here if ignore_ppc is 0?

I actually wanted to say "was negative" here, not sure why I said the
above in the end.

Anyway, the patch looks correct now.
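
For the record, a minimal user-space model of the two variants of that first
block. Everything below is a sketch under assumptions: the enum values are
stand-ins for the cpufreq events, ignore_ppc is assumed to start out negative,
and the "if (ignore_ppc) return" check that follows the first block is not
part of the quoted hunk but assumed here. Only CPUFREQ_ADJUST being acted upon
comes from the changelog.

```c
#include <stdbool.h>

/* Stand-ins for the cpufreq event codes; the numeric values are arbitrary. */
enum model_event { MODEL_START, MODEL_NOTIFY, MODEL_ADJUST };

/* Assumption: a negative value means "not decided yet". */
static int ignore_ppc = -1;

/*
 * Old first block: only the START event clears a non-positive ignore_ppc,
 * and it returns right away. Returns true if the ADJUST handling is reached.
 */
static bool old_reaches_adjust(enum model_event event)
{
	if (event == MODEL_START && ignore_ppc <= 0) {
		ignore_ppc = 0;
		return false;
	}

	if (ignore_ppc)		/* assumed follow-up check, see lead-in */
		return false;

	return event == MODEL_ADJUST;
}

/*
 * New first block: a negative ignore_ppc is cleared on whatever event
 * arrives first, and there is no early return, so that event is still
 * processed.
 */
static bool new_reaches_adjust(enum model_event event)
{
	if (ignore_ppc < 0)
		ignore_ppc = 0;

	if (ignore_ppc)		/* assumed follow-up check, see lead-in */
		return false;

	return event == MODEL_ADJUST;
}
```

In this model, an ADJUST event arriving while ignore_ppc is still negative is
dropped by the old block unless a START event came first, whereas the new
block handles it. Returning early when ignore_ppc was negative would drop that
first event again, which is why no return is wanted there.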

Thanks,
Rafael


Re: mm, vmscan: commit makes PAE kernel crash nightly (bisected)

2017-01-29 Thread Michal Hocko
On Sun 29-01-17 16:50:03, Trevor Cordes wrote:
> On 2017-01-25 Michal Hocko wrote:
> > On Wed 25-01-17 04:02:46, Trevor Cordes wrote:
> > > OK, I patched & compiled mhocko's git tree from the other day
> > > 4.9.0+. (To confirm, weird, but mhocko's git tree I'm using from a
> > > couple of weeks ago shows the newest commit (git log) is
> > > 69973b830859bc6529a7a0468ba0d80ee5117826 "Linux 4.9"?  Let me know
> > > if I'm doing something wrong, see below.)  
> > 
> > My fault. I should have noted that you should use since-4.9 branch.
> 
> OK, I have good news.  I compiled your mhocko git tree (properly this
> time!) using since-4.9 branch (last commit
> ca63ff9b11f958efafd8c8fa60fda14baec6149c Jan 25) and the box survived 3
> 3am's, over 60 hours, and I made sure all the usual oom culprits ran,
> and I ran extras (finds on the whole tree, extra rdiff-backups) to try
> to tax it.  Based on my previous criteria I would say your since-4.9 as
> of the above commit solves my bug, at least over a 3 day test span
> (which it never survives when the bug is present)!
> 
> I tested WITHOUT any cgroup/mem boot options.  I do still have my
> mem=6G limiter on, though (I've never tested with it off, until I solve
> the bug with it on, since I've had it on for many months for other
> reasons).

Good news indeed.

> 
> On 2017-01-27 Michal Hocko wrote:
> > OK, that matches the theory that these OOMs are caused by the
> > incorrect active list aging fixed by b4536f0c829c ("mm, memcg: fix
> > the active list aging for lowmem requests when memcg is enabled")
> 
> b4536f0c829c isn't in the since-4.9 I tested above though?

Yes, this is a sha1 from Linus' tree. The same commit is in the since-4.9
branch under 0759e73ee689f2066a4d64dd90ec5cc3fed28f86. There are some
more fixes on top, of course.

> So
> something else you did must have fixed it (also)?  I don't think I've
> run any tests yet with b4536f0c829c in them?  I think the vanillas I
> was doing a couple of weeks ago were before b4536f0c829c, but I can't
> be sure.
> 
> What do I test next?  Does the since-4.9 stuff get pushed into vanilla
> (4.9 hopefully?) so it can find its way into Fedora's stuck F24
> kernel?

Testing with vanilla rc6, released just yesterday, would be a good fit.
There are some more fixes sitting in mmotm on top and maybe we want some of them
in the final 4.10. Anyway, all those pending changes should be merged in the
next merge window - aka 4.11.

> I want to also note that the RHBZ
> https://bugzilla.redhat.com/show_bug.cgi?id=1401012 is garnering more
> interest as more people start me-too'ing.  The situation is almost
> always the same: large rsync's or similar tree-scan accesses cause oom
> on PAE boxes.

I believe your instructions in comment 20 cover it nicely. If the
problem still persists with the current mmotm tree I would suggest
writing to the mailing list (feel free to CC me) and we will have a
look. Thanks!

> However, I wanted to note that many people there reported
> that cgroup_disable=memory doesn't fix anything for them, whereas that
> always makes the problem go away on my boxes.  Strange.
> 
> Thanks Michal and Mel, I really appreciate it!

-- 
Michal Hocko
SUSE Labs


Re: [PATCH v5 07/11] pwm: imx: Provide atomic PWM support for i.MX PWMv2

2017-01-29 Thread Boris Brezillon
On Sun, 29 Jan 2017 22:54:11 +0100
Lukasz Majewski  wrote:

> This commit provides apply() callback implementation for i.MX's PWMv2.
> 
> Suggested-by: Stefan Agner 
> Suggested-by: Boris Brezillon 
> Signed-off-by: Lukasz Majewski 
> Reviewed-by: Boris Brezillon 
> ---
> Changes for v5:
> - Modify ->apply() function to avoid unbalanced clock enabling/disabling
> - Fix preventing iMX7 from hanging
> 
> Changes for v4:
> - Avoid recalculation of PWM parameters when disabling PWM signal
> - Unconditionally call clk_prepare_enable(imx->clk_per) and
>   clk_disable_unprepare(imx->clk_per)
> 
> Changes for v3:
> - Remove ipg clock enable/disable functions
> 
> Changes for v2:
> - None
> ---
>  drivers/pwm/pwm-imx.c | 68 +++
>  1 file changed, 68 insertions(+)
> 
> diff --git a/drivers/pwm/pwm-imx.c b/drivers/pwm/pwm-imx.c
> index 60cdc5c..fdaa11b 100644
> --- a/drivers/pwm/pwm-imx.c
> +++ b/drivers/pwm/pwm-imx.c
> @@ -249,6 +249,73 @@ static int imx_pwm_config(struct pwm_chip *chip,
>   return ret;
>  }
>  
> +static int imx_pwm_apply_v2(struct pwm_chip *chip, struct pwm_device *pwm,
> + struct pwm_state *state)
> +{
> + unsigned long period_cycles, duty_cycles, prescale;
> + struct imx_chip *imx = to_imx_chip(chip);
> + struct pwm_state cstate;
> + unsigned long long c;
> + int ret;
> +
> + pwm_get_state(pwm, &cstate);
> +
> + if (state->enabled) {
> + c = clk_get_rate(imx->clk_per);
> + c *= state->period;
> +
> + do_div(c, 1000000000);
> + period_cycles = c;
> +
> + prescale = period_cycles / 0x10000 + 1;
> +
> + period_cycles /= prescale;
> + c = (unsigned long long)period_cycles * state->duty_cycle;
> + do_div(c, state->period);
> + duty_cycles = c;
> +
> + /*
> +  * according to imx pwm RM, the real period value should be
> +  * PERIOD value in PWMPR plus 2.
> +  */
> + if (period_cycles > 2)
> + period_cycles -= 2;
> + else
> + period_cycles = 0;
> +
> + /*
> +  * Wait for a free FIFO slot if the PWM is already enabled, and
> +  * flush the FIFO if the PWM was disabled and is about to be
> +  * enabled.
> +  */
> + if (cstate.enabled) {
> + imx_pwm_wait_fifo_slot(chip, pwm);
> + } else if (state->enabled) {

Should just be

} else {

since we're already in the 'if (state->enabled)' block (see above).

I see that Thierry already applied the series, so, just for the record,
with this fixed, the whole series is

Reviewed-by: Boris Brezillon 

Thanks,

Boris

> + ret = clk_prepare_enable(imx->clk_per);
> + if (ret)
> + return ret;
> +
> + imx_pwm_sw_reset(chip);
> + }
> +
> + writel(duty_cycles, imx->mmio_base + MX3_PWMSAR);
> + writel(period_cycles, imx->mmio_base + MX3_PWMPR);
> +
> + writel(MX3_PWMCR_PRESCALER(prescale) |
> +MX3_PWMCR_DOZEEN | MX3_PWMCR_WAITEN |
> +MX3_PWMCR_DBGEN | MX3_PWMCR_CLKSRC_IPG_HIGH |
> +MX3_PWMCR_EN,
> +imx->mmio_base + MX3_PWMCR);
> +
> + } else if (cstate.enabled) {
> + writel(0, imx->mmio_base + MX3_PWMCR);
> +
> + clk_disable_unprepare(imx->clk_per);
> + }
> +
> + return 0;
> +}
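
For anyone who wants to try the period/prescaler arithmetic of the quoted
hunk outside the kernel, here is a rough user-space sketch. It is not the
driver code: pwm_v2_calc() and struct pwm_v2_regs are made-up names, and the
nanoseconds-per-second divisor and the 0x10000 counter range are assumptions
about the intended constants.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the values that would be written to the PWM. */
struct pwm_v2_regs {
	unsigned long prescale;      /* divider applied to the clock */
	unsigned long period_cycles; /* value for the period register */
	unsigned long duty_cycles;   /* value for the sample register */
};

/* Assumed constants: periods are in nanoseconds, counter is 16 bits wide. */
#define DEMO_NSEC_PER_SEC  1000000000ULL
#define DEMO_COUNTER_RANGE 0x10000UL

static struct pwm_v2_regs pwm_v2_calc(unsigned long clk_rate_hz,
				      unsigned int period_ns,
				      unsigned int duty_ns)
{
	struct pwm_v2_regs r;
	uint64_t c;

	/* A zero period would divide by zero below. */
	assert(period_ns != 0);

	/* Clock cycles that fit into one period. */
	c = (uint64_t)clk_rate_hz * period_ns / DEMO_NSEC_PER_SEC;
	r.period_cycles = (unsigned long)c;

	/* Smallest prescaler that keeps the period inside the counter range. */
	r.prescale = r.period_cycles / DEMO_COUNTER_RANGE + 1;
	r.period_cycles /= r.prescale;

	/* Scale the duty cycle by the same ratio as the period. */
	c = (uint64_t)r.period_cycles * duty_ns / period_ns;
	r.duty_cycles = (unsigned long)c;

	/* The real period is the register value plus 2, so compensate. */
	if (r.period_cycles > 2)
		r.period_cycles -= 2;
	else
		r.period_cycles = 0;

	return r;
}
```

With a 66 MHz clock, a 1 ms period and a 50% duty cycle, this yields a
prescaler of 2, a period value of 32998 and a duty value of 16500.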


Re: [RFC PATCH] scsi, block: fix duplicate bdi name registration crashes

2017-01-29 Thread Dan Williams
On Sun, Jan 29, 2017 at 11:22 PM, Omar Sandoval  wrote:
> On Mon, Jan 30, 2017 at 08:05:52AM +0100, Hannes Reinecke wrote:
>> On 01/29/2017 05:58 AM, Dan Williams wrote:
>> > Warnings of the following form occur because scsi reuses a devt number
>> > while the block layer still has it referenced as the name of the bdi
>> > [1]:
>> >
>> >  WARNING: CPU: 1 PID: 93 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80
>> >  sysfs: cannot create duplicate filename '/devices/virtual/bdi/8:192'
>> >  [..]
>> >  Call Trace:
>> >   dump_stack+0x86/0xc3
>> >   __warn+0xcb/0xf0
>> >   warn_slowpath_fmt+0x5f/0x80
>> >   ? kernfs_path_from_node+0x4f/0x60
>> >   sysfs_warn_dup+0x62/0x80
>> >   sysfs_create_dir_ns+0x77/0x90
>> >   kobject_add_internal+0xb2/0x350
>> >   kobject_add+0x75/0xd0
>> >   device_add+0x15a/0x650
>> >   device_create_groups_vargs+0xe0/0xf0
>> >   device_create_vargs+0x1c/0x20
>> >   bdi_register+0x90/0x240
>> >   ? lockdep_init_map+0x57/0x200
>> >   bdi_register_owner+0x36/0x60
>> >   device_add_disk+0x1bb/0x4e0
>> >   ? __pm_runtime_use_autosuspend+0x5c/0x70
>> >   sd_probe_async+0x10d/0x1c0
>> >   async_run_entry_fn+0x39/0x170
>> >
>> > This is a brute-force fix to pass the devt release information from
>> > sd_probe() to the locations where we register the bdi,
>> > device_add_disk(), and unregister the bdi, blk_cleanup_queue().
>> >
>> > Thanks to Omar for the quick reproducer script [2]. This patch survives
>> > where an unmodified kernel fails in a few seconds.
>> >
>> > [1]: https://marc.info/?l=linux-scsi&m=147116857810716&w=4
>> > [2]: http://marc.info/?l=linux-block&m=148554717109098&w=2
>> >
>> > Cc: James Bottomley 
>> > Cc: Bart Van Assche 
>> > Cc: "Martin K. Petersen" 
>> > Cc: Christoph Hellwig 
>> > Cc: Jens Axboe 
>> > Reported-by: Omar Sandoval 
>> > Signed-off-by: Dan Williams 
>> > ---
>> >  block/blk-core.c   |1 +
>> >  block/genhd.c  |7 +++
>> >  drivers/scsi/sd.c  |   41 +
>> >  include/linux/blkdev.h |1 +
>> >  include/linux/genhd.h  |   17 +
>> >  5 files changed, 59 insertions(+), 8 deletions(-)
>> >
>> Please check the patchset from Jan Kara (cf 'BDI lifetime fix' on
>> linux-block), which attempts to solve the same problem.
>
> Hi, Hannes,
>
> It's not the same problem. Jan's series fixes a bdi vs. inode lifetime
> issue, this patch is for a bdi vs devt lifetime issue. Jan's series
> doesn't fix the crashes caused by my reproducer script.

Correct. In fact I was running Jan's patches in my baseline kernel
that fails almost immediately.


Re: [PATCH v2 1/2] nvmem: sunxi-sid: add support for H3 and A64's SID controller

2017-01-29 Thread maxime . ripard
On Sun, Jan 29, 2017 at 09:56:40AM +0800, Icenowy Zheng wrote:
> H3 and A64 SoCs have a bigger SID controller, which has its direct read
> address at 0x200 position in the SID block, not 0x0.
> 
> Also, H3 SID controller has some silicon bug that makes the direct read
> value wrong at first, add code to workaround the bug. (This bug has
> already been fixed on A64 and later SoCs)
> 
> Signed-off-by: Icenowy Zheng 

Please split that into several patches: one to allow setting the size
through the structure, one to support the A64, and one to support the
H3.

Thanks,
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


signature.asc
Description: PGP signature


Re: [PATCH v5 03/11] pwm: imx: Add separate set of pwm ops for PWMv1 and PWMv2

2017-01-29 Thread Thierry Reding
On Mon, Jan 30, 2017 at 08:23:12AM +0100, Thierry Reding wrote:
> On Sun, Jan 29, 2017 at 10:54:07PM +0100, Lukasz Majewski wrote:
> > From: Lukasz Majewski 
> > 
> > This patch provides separate set of pwm ops utilized by
> > i.MX's PWMv1 and PWMv2.
> > 
> > Signed-off-by: Lothar Waßmann 
> > Signed-off-by: Bhuvanchandra DV 
> > Signed-off-by: Lukasz Majewski 
> > Acked-by: Shawn Guo 
> > Reviewed-by: Sascha Hauer 
> > ---
> > Changes for v5:
> > - None
> > 
> > Changes for v4:
> > - None
> > 
> > Changes for v3:
> > - Adjust the code to work with ipg clock removed
> > 
> > Changes for v2:
> > - New patch
> > ---
> >  drivers/pwm/pwm-imx.c | 17 ++---
> >  1 file changed, 14 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/pwm/pwm-imx.c b/drivers/pwm/pwm-imx.c
> > index b1d1e50..0fa480d 100644
> > --- a/drivers/pwm/pwm-imx.c
> > +++ b/drivers/pwm/pwm-imx.c
> > @@ -239,7 +239,14 @@ static void imx_pwm_disable(struct pwm_chip *chip, 
> > struct pwm_device *pwm)
> > clk_disable_unprepare(imx->clk_per);
> >  }
> >  
> > -static struct pwm_ops imx_pwm_ops = {
> > +static struct pwm_ops imx_pwm_ops_v1 = {
> > +   .enable = imx_pwm_enable,
> > +   .disable = imx_pwm_disable,
> > +   .config = imx_pwm_config,
> > +   .owner = THIS_MODULE,
> > +};
> > +
> > +static struct pwm_ops imx_pwm_ops_v2 = {
> 
> Can't these two be const? No need to respin for only this, just let me
> know and I can make the change while applying.

Never mind that. I just remembered that I had picked up a patch to make
the original imx_pwm_ops const, and things still work fine if I make
both of the above const. I just had to apply your patch manually, but
other than that it seems fine. Let me apply the rest of this set and
push it out. It'd be great if you could check afterwards that it's all
still what you expect.

Thierry
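
As an illustration of why the ops tables can be const, here is a small
stand-alone sketch. struct demo_pwm_ops and the demo_* functions are
hypothetical stand-ins, not the kernel's struct pwm_ops; the point is only
that callers need nothing more than a pointer-to-const to call through the
table.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for an ops table of function pointers. */
struct demo_pwm_ops {
	int (*enable)(int channel);
	void (*disable)(int channel);
};

/* Tracks whether the fake PWM is currently enabled. */
static int demo_enabled;

static int demo_enable(int channel)
{
	(void)channel;
	demo_enabled = 1;
	return 0;
}

static void demo_disable(int channel)
{
	(void)channel;
	demo_enabled = 0;
}

/* const: the table can live in read-only data and cannot be modified. */
static const struct demo_pwm_ops demo_ops_v1 = {
	.enable = demo_enable,
	.disable = demo_disable,
};

/* Consumers only take a pointer-to-const and call through it. */
static int demo_chip_enable(const struct demo_pwm_ops *ops, int channel)
{
	return ops->enable ? ops->enable(channel) : -1;
}
```

Any later attempt to assign to a member of demo_ops_v1 is rejected at
compile time, which is the main benefit of the const qualifier here.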


signature.asc
Description: PGP signature


Re: [PATCH v2 00/10] clk: sunxi-ng: Add support for A80 CCUs

2017-01-29 Thread Maxime Ripard
Hi,

On Sat, Jan 28, 2017 at 08:22:29PM +0800, Chen-Yu Tsai wrote:
> Hi everyone,
> 
> This is v2 of my A80 CCU clk patches. Changes since v1:
> 
>   - Use pre-divider adjusted parent rate for rounding.
> 
>   - Use else statement for the case where the PLL lock status bit is
> in same register.
> 
>   - Add a more detailed description of the main CCU and DE CCU to the
> commit messages.
> 
>   - Fix DE CCU compatible string in DT binding example.
> 
>   - Fix incorrectly squashed patch hunk.
> 
>   - Drop leading zeros from device tree node name in DT examples.
> 
>   - Expanded commit message for "ARM: dts: sun8i-a23-q8-tablet: Drop
> pinmux setting for codec PA gpio".
> 
> This series adds new "sunxi-ng" style drivers for the CCUs found in the
> Allwinner A80 SoC. The A80 contains 1 main clock control unit, and some
> subsystem specific clock control units at separate addresses. These
> include the USB, display engine, and MMC.
> 
>   - The MMC clocks can be supported by the old clock drivers,
> hence here we do not add a new driver for it.
> 
>   - The old USB clock driver is intertwined with other SoCs,
> requires old style bindings with clock-output-names and
> CLK_OF_DECLARE for its parents. It is easier to switch
> to a new binding and driver.
> 
>   - The display engine (DE) CCU was not supported in the past.
> 
> The A80 CCU also has some quirks about its design. It has
> 
>   - Separate registers for PLL lock status
> 
>   - P1, P2 dividers, which are power-of-2 and only 1 bit wide
> 
> The first 3 patches fix and extend the behavior of sunxi-ng's
> mux clock type, based on the behavior of the clk subsystem's
> basic mux clock.
> 
> The fourth patch adds support for checking PLL lock status
> bits in separate registers, as opposed to within the PLL's
> config register.
> 
> Patches 5 through 7 add drivers for the CCU blocks.
>
> Patch 8 and 9 do some cleanup of the sunxi/allwinner dts files
> prior to switching sun9i dts to the new sunxi-ng clock bindings.
> These are independent of the clk stuff, but touch the same lines
> for sun9i. Including them should make it easier to apply and test
> patches.
> 
> Patch 10 has sun9i switch over to the new clock bindings.
> 
> Please take a look and let me know what you think.


This is a bit late, but I took it in anyway.

Note that I only applied the patches about the CCU. The pinctrl header
stuff is quite conflict heavy, and not really a big deal anyway.

(and I really would have liked to have it as a separate series. This
has nothing to do with the A80 CCU).

Thanks,
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


signature.asc
Description: PGP signature


RE: [PATCH v2 1/2] PCI: Xilinx NWL: Modifying irq chip for legacy interrupts

2017-01-29 Thread Bharat Kumar Gogada
> The subject line is not very descriptive. How about "Enforce level
> triggering for legacy interrupts"?
> 
> On 25/01/17 08:52, Bharat Kumar Gogada wrote:
> > - Few wifi end points which only support legacy interrupts,
> > performs hardware reset functionalities after disabling interrupts
> > by invoking disable_irq and then re-enable using enable_irq, they
> > enable hardware interrupts first and then virtual irq line later.
> > - The legacy irq line goes low only after DEASSERT_INTx is
> > received.As the legacy irq line is high immediately after hardware
> > interrupts are enabled but virq of EP is still in disabled state
> > and EP handler is never executed resulting no DEASSERT_INTx.If dummy
> > irq chip is used, interrutps are not masked and system is
> 
> interrupts
> 
> > hanging with CPU stall.
> > - Adding irq chip functions instead of dummy irq chip for legacy
> > interrupts.
> > - Legacy interrupts are level sensitive, so using handle_level_irq
> > is more appropriate as it is masks interrupts until End point handles
> > interrupts and unmasks interrutps after End point handler is executed.
> 
>  interrupts
> 
> >
> > Signed-off-by: Bharat Kumar Gogada 
> > ---
> >  drivers/pci/host/pcie-xilinx-nwl.c | 36
> +++-
> >  1 file changed, 35 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/host/pcie-xilinx-nwl.c 
> > b/drivers/pci/host/pcie-xilinx-
> nwl.c
> > index 43eaa4a..6ac3e1d 100644
> > --- a/drivers/pci/host/pcie-xilinx-nwl.c
> > +++ b/drivers/pci/host/pcie-xilinx-nwl.c
> > @@ -395,10 +395,44 @@ static void nwl_pcie_msi_handler_low(struct
> irq_desc *desc)
> > chained_irq_exit(chip, desc);
> >  }
> >
> > +static void nwl_mask_leg_irq(struct irq_data *data)
> > +{
> > +   struct irq_desc *desc = irq_to_desc(data->irq);
> > +   struct nwl_pcie *pcie;
> > +   u32 mask;
> > +   u32 val;
> > +
> > +   pcie = irq_desc_get_chip_data(desc);
> > +   mask = 1 << (data->hwirq - 1);
> > +   val = nwl_bridge_readl(pcie, MSGF_LEG_MASK);
> > +   nwl_bridge_writel(pcie, (val & (~mask)), MSGF_LEG_MASK);
> 
> Oh please! Think of the following:
> 
>   cpu0            cpu1
>   read
>                   read
>   write
>                   write
> 
> How can you make this reliable if you don't have any form of mutual
> exclusion that spans both mask and unmask, and ensures the atomicity of
> the RMW sequence?
> 
Agreed, will send with locks.
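
A rough user-space sketch of what "with locks" could look like: one lock
covers the whole read-modify-write in both mask and unmask, so the
interleaving shown above cannot lose an update. Everything here is
illustrative and assumed: fake_leg_mask_reg stands in for MSGF_LEG_MASK, and
the atomic_flag spinlock stands in for whatever kernel lock the real patch
would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for the mask register; a set bit means "interrupt enabled". */
static uint32_t fake_leg_mask_reg;

/* Minimal user-space spinlock standing in for a kernel spinlock. */
static atomic_flag leg_lock = ATOMIC_FLAG_INIT;

static void leg_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&leg_lock,
						 memory_order_acquire))
		; /* spin until the current holder releases the lock */
}

static void leg_lock_release(void)
{
	atomic_flag_clear_explicit(&leg_lock, memory_order_release);
}

/* Mask: clear the enable bit for a 1-based hwirq, all under the lock. */
static void leg_irq_mask(unsigned int hwirq)
{
	assert(hwirq >= 1 && hwirq <= 32);

	leg_lock_acquire();
	fake_leg_mask_reg &= ~(UINT32_C(1) << (hwirq - 1));
	leg_lock_release();
}

/* Unmask: set the enable bit for a 1-based hwirq, under the same lock. */
static void leg_irq_unmask(unsigned int hwirq)
{
	assert(hwirq >= 1 && hwirq <= 32);

	leg_lock_acquire();
	fake_leg_mask_reg |= UINT32_C(1) << (hwirq - 1);
	leg_lock_release();
}
```

The important part is that mask and unmask share the same lock; two separate
locks would still allow the lost-update race between the two paths.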


Re: [RFC PATCH] scsi, block: fix duplicate bdi name registration crashes

2017-01-29 Thread Omar Sandoval
On Mon, Jan 30, 2017 at 08:05:52AM +0100, Hannes Reinecke wrote:
> On 01/29/2017 05:58 AM, Dan Williams wrote:
> > Warnings of the following form occur because scsi reuses a devt number
> > while the block layer still has it referenced as the name of the bdi
> > [1]:
> > 
> >  WARNING: CPU: 1 PID: 93 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80
> >  sysfs: cannot create duplicate filename '/devices/virtual/bdi/8:192'
> >  [..]
> >  Call Trace:
> >   dump_stack+0x86/0xc3
> >   __warn+0xcb/0xf0
> >   warn_slowpath_fmt+0x5f/0x80
> >   ? kernfs_path_from_node+0x4f/0x60
> >   sysfs_warn_dup+0x62/0x80
> >   sysfs_create_dir_ns+0x77/0x90
> >   kobject_add_internal+0xb2/0x350
> >   kobject_add+0x75/0xd0
> >   device_add+0x15a/0x650
> >   device_create_groups_vargs+0xe0/0xf0
> >   device_create_vargs+0x1c/0x20
> >   bdi_register+0x90/0x240
> >   ? lockdep_init_map+0x57/0x200
> >   bdi_register_owner+0x36/0x60
> >   device_add_disk+0x1bb/0x4e0
> >   ? __pm_runtime_use_autosuspend+0x5c/0x70
> >   sd_probe_async+0x10d/0x1c0
> >   async_run_entry_fn+0x39/0x170
> > 
> > This is a brute-force fix to pass the devt release information from
> > sd_probe() to the locations where we register the bdi,
> > device_add_disk(), and unregister the bdi, blk_cleanup_queue().
> > 
> > Thanks to Omar for the quick reproducer script [2]. This patch survives
> > where an unmodified kernel fails in a few seconds.
> > 
> > [1]: https://marc.info/?l=linux-scsi&m=147116857810716&w=4
> > [2]: http://marc.info/?l=linux-block&m=148554717109098&w=2
> > 
> > Cc: James Bottomley 
> > Cc: Bart Van Assche 
> > Cc: "Martin K. Petersen" 
> > Cc: Christoph Hellwig 
> > Cc: Jens Axboe 
> > Reported-by: Omar Sandoval 
> > Signed-off-by: Dan Williams 
> > ---
> >  block/blk-core.c   |1 +
> >  block/genhd.c  |7 +++
> >  drivers/scsi/sd.c  |   41 +
> >  include/linux/blkdev.h |1 +
> >  include/linux/genhd.h  |   17 +
> >  5 files changed, 59 insertions(+), 8 deletions(-)
> > 
> Please check the patchset from Jan Kara (cf 'BDI lifetime fix' on
> linux-block), which attempts to solve the same problem.

Hi, Hannes,

It's not the same problem. Jan's series fixes a bdi vs. inode lifetime
issue, this patch is for a bdi vs devt lifetime issue. Jan's series
doesn't fix the crashes caused by my reproducer script.
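
To make the bdi vs. devt lifetime issue a bit more concrete, here is a toy
user-space model. It is only a sketch under my own assumptions: struct
demo_registry plays the role of the sysfs directory holding bdi names, and
demo_register()/demo_unregister() are made-up helpers, not block layer
functions. Registering "8:192" again while the old name is still present
fails, just like the duplicate filename warning in the report.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_NAMES 4
#define DEMO_NAME_LEN  24

/* Toy stand-in for the directory that holds the bdi names. */
struct demo_registry {
	char names[DEMO_MAX_NAMES][DEMO_NAME_LEN];
	int used[DEMO_MAX_NAMES];
};

/* Register "major:minor"; fails with -EEXIST while the old name lingers. */
static int demo_register(struct demo_registry *r,
			 unsigned int major, unsigned int minor)
{
	char name[DEMO_NAME_LEN];
	int i, free_slot = -1;

	snprintf(name, sizeof(name), "%u:%u", major, minor);
	for (i = 0; i < DEMO_MAX_NAMES; i++) {
		if (r->used[i] && strcmp(r->names[i], name) == 0)
			return -EEXIST;
		if (!r->used[i] && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -ENOSPC;

	strcpy(r->names[free_slot], name);
	r->used[free_slot] = 1;
	return 0;
}

/* Drop the name so that the same devt can be registered again. */
static void demo_unregister(struct demo_registry *r,
			    unsigned int major, unsigned int minor)
{
	char name[DEMO_NAME_LEN];
	int i;

	snprintf(name, sizeof(name), "%u:%u", major, minor);
	for (i = 0; i < DEMO_MAX_NAMES; i++)
		if (r->used[i] && strcmp(r->names[i], name) == 0)
			r->used[i] = 0;
}
```

In this model the devt may only be handed out again after
demo_unregister() has run; reusing it earlier reproduces the -EEXIST case.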


scsi: use-after-free in sg_start_req

2017-01-29 Thread Dmitry Vyukov
Hello,

The following program triggers use-after-free in sg_start_req:
https://gist.githubusercontent.com/dvyukov/be6561d2819fe30a78711234e53866b8/raw/1d75d4508f7a8ebb0b1ec0d18c0054fbffbc0708/gistfile1.txt

BUG: KASAN: use-after-free in bio_copy_user_iov+0xee1/0xf00
block/bio.c:1248 at addr 8801c8c3ed00
Read of size 8 by task /9023
CPU: 0 PID: 9023 Comm:  Not tainted 4.9.0 #5
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
 8801d451f420 82346bdf  11003a8a3e17
 ed003a8a3e0f 41b58ab3 84b37e38 823468f1
 813183a6 8801d451f0e0  
Call Trace:
 [] __dump_stack lib/dump_stack.c:15 [inline]
 [] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 [] kasan_object_err+0x1c/0x70 mm/kasan/report.c:161
 [] print_address_description mm/kasan/report.c:199 [inline]
 [] kasan_report_error+0x1d1/0x4d0 mm/kasan/report.c:288
 [] kasan_report mm/kasan/report.c:308 [inline]
 [] __asan_report_load8_noabort+0x3e/0x40
mm/kasan/report.c:329
 [] bio_copy_user_iov+0xee1/0xf00 block/bio.c:1248
 [] __blk_rq_map_user_iov block/blk-map.c:56 [inline]
 [] blk_rq_map_user_iov+0x2c5/0x970 block/blk-map.c:133
 [] blk_rq_map_user+0x134/0x1d0 block/blk-map.c:163
 [] sg_start_req drivers/scsi/sg.c:1758 [inline]
 [] sg_common_write.isra.20+0x12b1/0x1b00
drivers/scsi/sg.c:772
 [] sg_write+0x785/0xda0 drivers/scsi/sg.c:675
 [] __vfs_write+0x5b1/0x740 fs/read_write.c:510
 [] vfs_write+0x170/0x4e0 fs/read_write.c:560
 [] SYSC_write fs/read_write.c:607 [inline]
 [] SyS_write+0xfb/0x230 fs/read_write.c:599
 [] entry_SYSCALL_64_fastpath+0x1f/0xc2
Object at 8801c8c3ed00, in cache kmalloc-256 size: 256
Allocated:
PID = 9032
 [   52.586815] [] save_stack_trace+0x16/0x20
arch/x86/kernel/stacktrace.c:57
 [   52.594037] [] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
 [   52.600735] [] set_track mm/kasan/kasan.c:507 [inline]
 [   52.600735] [] kasan_kmalloc+0xaa/0xd0
mm/kasan/kasan.c:598
 [   52.607700] [] __do_kmalloc mm/slab.c:3729 [inline]
 [   52.607700] [] __kmalloc+0x12c/0x690 mm/slab.c:3738
 [   52.614520] [] kmalloc include/linux/slab.h:495 [inline]
 [   52.614520] [] kzalloc include/linux/slab.h:636 [inline]
 [   52.614520] [] sg_build_sgat
drivers/scsi/sg.c:1808 [inline]
 [   52.614520] []
sg_build_indirect.isra.19+0x8b/0x540 drivers/scsi/sg.c:1834
 [   52.622591] [] sg_build_reserve+0x8d/0xb0
drivers/scsi/sg.c:1965
 [   52.629815] [] sg_add_sfp drivers/scsi/sg.c:2152 [inline]
 [   52.629815] [] sg_open+0xcb1/0x15b0 drivers/scsi/sg.c:329
 [   52.636503] [] chrdev_open+0x253/0x6b0 fs/char_dev.c:392
 [   52.643451] [] do_dentry_open+0x6ca/0xc50 fs/open.c:753
 [   52.650660] [] vfs_open+0x105/0x220 fs/open.c:866
 [   52.657351] [] do_last fs/namei.c:3374 [inline]
 [   52.657351] [] path_openat+0x100f/0x3830 fs/namei.c:3497
 [   52.664488] [] do_filp_open+0x288/0x3f0 fs/namei.c:3532
 [   52.671538] [] do_sys_open+0x535/0x710 fs/open.c:1053
 [   52.678484] [] SYSC_open fs/open.c:1071 [inline]
 [   52.678484] [] SyS_open+0x2d/0x40 fs/open.c:1066
 [   52.685000] [] entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed:
PID = 9032
 [   52.697636] [] save_stack_trace+0x16/0x20
arch/x86/kernel/stacktrace.c:57
 [   52.704842] [] save_stack+0x43/0xd0 mm/kasan/kasan.c:495
 [   52.711522] [] set_track mm/kasan/kasan.c:507 [inline]
 [   52.711522] [] kasan_slab_free+0x6f/0xb0
mm/kasan/kasan.c:571
 [   52.718640] [] __cache_free mm/slab.c:3507 [inline]
 [   52.718640] [] kfree+0xd3/0x250 mm/slab.c:3824
 [   52.724979] []
sg_remove_scat.isra.16+0x212/0x2d0 drivers/scsi/sg.c:1916
 [   52.732879] [] sg_ioctl+0x1903/0x3840
drivers/scsi/sg.c:970
 [   52.739745] [] vfs_ioctl fs/ioctl.c:43 [inline]
 [   52.739745] [] do_vfs_ioctl+0x1bf/0x1630 fs/ioctl.c:679
 [   52.746866] [] SYSC_ioctl fs/ioctl.c:694 [inline]
 [   52.746866] [] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:685
 [   52.753478] [] entry_SYSCALL_64_fastpath+0x1f/0xc2

On commit ca63ff9b11f958efafd8c8fa60fda14baec6149c


Re: [PATCH v5 05/11] pwm: imx: Move PWMv2 software reset code to a separate function

2017-01-29 Thread Thierry Reding
On Sun, Jan 29, 2017 at 10:54:09PM +0100, Lukasz Majewski wrote:
> From: Lukasz Majewski 
> 
> The software reset code has been extracted from imx_pwm_config_v2 function
> and moved to new one - imx_pwm_sw_reset().
> 
> This change reduces the overall size of imx_pwm_config_v2() and prepares
> it for atomic PWM operation.
> 
> Suggested-by: Stefan Agner 
> Suggested-by: Boris Brezillon 
> Signed-off-by: Lukasz Majewski 
> 
> ---
> Changes for v5:
> - None
> 
> Changes for v4:
> - None
> 
> Changes for v3:
> - None
> 
> Changes for v2:
> - Add missing parenthesis
> ---
>  drivers/pwm/pwm-imx.c | 31 +--
>  1 file changed, 21 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/pwm/pwm-imx.c b/drivers/pwm/pwm-imx.c
> index 11e3f3e..f0d78f3 100644
> --- a/drivers/pwm/pwm-imx.c
> +++ b/drivers/pwm/pwm-imx.c
> @@ -120,6 +120,25 @@ static void imx_pwm_disable_v1(struct pwm_chip *chip, 
> struct pwm_device *pwm)
>   clk_disable_unprepare(imx->clk_per);
>  }
>  
> +static void imx_pwm_sw_reset(struct pwm_chip *chip)
> +{
> + struct imx_chip *imx = to_imx_chip(chip);
> + struct device *dev = chip->dev;
> + int wait_count = 0;
> + u32 cr;
> +
> + writel(MX3_PWMCR_SWR, imx->mmio_base + MX3_PWMCR);
> + do {
> + usleep_range(200, 1000);
> + cr = readl(imx->mmio_base + MX3_PWMCR);
> + } while ((cr & MX3_PWMCR_SWR) &&
> +  (wait_count++ < MX3_PWM_SWR_LOOP));

I think you could replace this by one of the accessors from
linux/iopoll.h, but that can be a separate patch.

Thierry
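
For reference, a minimal user-space sketch of the bounded-poll pattern under discussion. In the kernel, linux/iopoll.h provides readl_poll_timeout(addr, val, cond, sleep_us, timeout_us), which returns 0 once cond holds and -ETIMEDOUT otherwise. Everything below (demo_reg, demo_readl, demo_poll_bit_clear, DEMO_SWR_BIT) is a hypothetical stand-in and not code from pwm-imx.c; the model counts polls instead of sleeping.

```c
#include <stdint.h>

#define DEMO_SWR_BIT (1u << 3)	/* hypothetical stand-in for MX3_PWMCR_SWR */

/* Simulated register: the reset bit clears itself after a number of reads. */
struct demo_reg {
	uint32_t value;
	int reads_until_clear;	/* 0 means the bit never clears */
};

/* Stand-in for readl(): returns the register and advances the simulation. */
static uint32_t demo_readl(struct demo_reg *reg)
{
	if (reg->reads_until_clear > 0 && --reg->reads_until_clear == 0)
		reg->value &= ~DEMO_SWR_BIT;
	return reg->value;
}

/*
 * Poll until every bit in mask is clear or max_polls reads have been made.
 * Returns 0 on success and -1 on timeout; *val holds the last value read,
 * mirroring how the iopoll helpers leave the final read in 'val'.
 */
static int demo_poll_bit_clear(struct demo_reg *reg, uint32_t mask,
			       int max_polls, uint32_t *val)
{
	int i;

	for (i = 0; i < max_polls; i++) {
		*val = demo_readl(reg);
		if (!(*val & mask))
			return 0;
	}
	return -1;
}
```

Compared with the open-coded do/while loop in the quoted hunk, the helper form puts the exit condition and the bound in one call and gives the caller a single return value to check.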


Re: [PATCH v5 03/11] pwm: imx: Add separate set of pwm ops for PWMv1 and PWMv2

2017-01-29 Thread Thierry Reding
On Sun, Jan 29, 2017 at 10:54:07PM +0100, Lukasz Majewski wrote:
> From: Lukasz Majewski 
> 
> This patch provides separate set of pwm ops utilized by
> i.MX's PWMv1 and PWMv2.
> 
> Signed-off-by: Lothar Waßmann 
> Signed-off-by: Bhuvanchandra DV 
> Signed-off-by: Lukasz Majewski 
> Acked-by: Shawn Guo 
> Reviewed-by: Sascha Hauer 
> ---
> Changes for v5:
> - None
> 
> Changes for v4:
> - None
> 
> Changes for v3:
> - Adjust the code to work with ipg clock removed
> 
> Changes for v2:
> - New patch
> ---
>  drivers/pwm/pwm-imx.c | 17 ++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pwm/pwm-imx.c b/drivers/pwm/pwm-imx.c
> index b1d1e50..0fa480d 100644
> --- a/drivers/pwm/pwm-imx.c
> +++ b/drivers/pwm/pwm-imx.c
> @@ -239,7 +239,14 @@ static void imx_pwm_disable(struct pwm_chip *chip, 
> struct pwm_device *pwm)
>   clk_disable_unprepare(imx->clk_per);
>  }
>  
> -static struct pwm_ops imx_pwm_ops = {
> +static struct pwm_ops imx_pwm_ops_v1 = {
> + .enable = imx_pwm_enable,
> + .disable = imx_pwm_disable,
> + .config = imx_pwm_config,
> + .owner = THIS_MODULE,
> +};
> +
> +static struct pwm_ops imx_pwm_ops_v2 = {

Can't these two be const? No need to respin for only this, just let me
know and I can make the change while applying.

Thierry
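
A small user-space sketch of what the const suggestion amounts to: two read-only ops tables selected per hardware variant. The names (demo_ops, demo_chip, demo_enable_v1/v2) are hypothetical and only model the pattern; the sketch assumes the consumer stores a pointer-to-const, which is what makes const tables usable.

```c
/* Hypothetical stand-in for an ops structure such as struct pwm_ops. */
struct demo_ops {
	int (*enable)(int channel);
};

static int demo_enable_v1(int channel) { (void)channel; return 1; }
static int demo_enable_v2(int channel) { (void)channel; return 2; }

/* const: the tables are never written after build time. */
static const struct demo_ops demo_ops_v1 = {
	.enable = demo_enable_v1,
};

static const struct demo_ops demo_ops_v2 = {
	.enable = demo_enable_v2,
};

/* The consumer only needs a pointer to a const table. */
struct demo_chip {
	const struct demo_ops *ops;
};

static int demo_chip_enable(const struct demo_chip *chip, int channel)
{
	return chip->ops->enable(channel);
}
```

Marking the tables const lets the compiler place them in read-only data and rejects accidental writes to the function pointers at compile time.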


Re: [PATCH v5 2/8] PCI: Allow runtime PM on Thunderbolt ports

2017-01-29 Thread Rafael J. Wysocki
On Sat, Jan 28, 2017 at 11:09 PM, Bjorn Helgaas  wrote:
> On Sun, Jan 15, 2017 at 09:03:45PM +0100, Lukas Wunner wrote:
>> Currently PCIe ports are only allowed to go to D3 if the BIOS is dated
>> 2015 or newer to avoid potential issues with old chipsets.  However for
>> Thunderbolt we know that even the oldest controller, Light Ridge (2010),
>> is able to suspend its ports to D3 just fine.
>>
>> We're about to add runtime PM for Thunderbolt on the Mac.  Apple has
>> released two EFI security updates in 2015 which encompass all machines
>> with Thunderbolt, but the achieved power saving should be made available
>> to users even if they haven't updated their BIOS.  To this end,
>> special-case Thunderbolt in pci_bridge_d3_possible().
>
> I think this whole paragraph is unnecessary detail.  I first thought
> you had some connection with a firmware security issue, but now I see
> the only point is that if you have pre-2015 firmware, you could update
> it since newer firmware is available.
>
>> This allows the Thunderbolt controller to power down but the root port
>> to which the Thunderbolt controller is attached remains in D0 unless
>> the EFI update is installed.  Users can pass pcie_port_pm=force on the
>> kernel command line if they cannot install the EFI update but still want
>> to benefit from the additional power saving of putting the root port
>> into D3.  In practice, root ports can be suspended to D3 without issues
>> at least on 2012 Ivy Bridge machines.
>
> I'm not sure I like advertising pcie_port_pm=force.  That just means a
> few leet folks will use this parameter and run in a subtly different
> configuration than everybody else, and possibly trip over subtly
> different issues.  The audience (users who read kernel change logs and
> are willing to use special boot parameters, but who can't install an
> EFI update) seems small.

That basically is for somebody who has a product and knows that the
feature works there, but doesn't want or simply can't patch the kernel
(which is shipped by a distro or similar).

Thanks,
Rafael


Re: [PATCH V2 1/2] ACPI: processor_perflib: Simplify code and stop using CPUFREQ_START

2017-01-29 Thread Rafael J. Wysocki
On Mon, Jan 30, 2017 at 5:29 AM, Viresh Kumar  wrote:
> acpi_processor_ppc_notifier() can live without using CPUFREQ_START
> (which is gonna be removed soon), as it is only used while setting
> ignore_ppc to 0. This can be done with the help of "ignore_ppc < 0"
> check alone. The notifier function anyway ignores all events except
> CPUFREQ_ADJUST and dropping CPUFREQ_START wouldn't harm at all.
>
> Once CPUFREQ_START event is removed from the cpufreq core,
> acpi_processor_ppc_notifier() will get called only for CPUFREQ_NOTIFY or
> CPUFREQ_ADJUST event. Drop the return statement from the first if block
> to make sure we don't ignore any such events.
>
> Signed-off-by: Viresh Kumar 
>
> ---
> V1->V2:
> - Improved changelog
> - Don't move the first if block to a later point, as it becomes useless
>   then.
> ---
>  drivers/acpi/processor_perflib.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/acpi/processor_perflib.c 
> b/drivers/acpi/processor_perflib.c
> index f0b4a981b8d3..18b72eec3507 100644
> --- a/drivers/acpi/processor_perflib.c
> +++ b/drivers/acpi/processor_perflib.c
> @@ -75,10 +75,8 @@ static int acpi_processor_ppc_notifier(struct 
> notifier_block *nb,
> struct acpi_processor *pr;
> unsigned int ppc = 0;
>
> -   if (event == CPUFREQ_START && ignore_ppc <= 0) {
> +   if (ignore_ppc < 0)
> ignore_ppc = 0;
> -   return 0;
> -   }

Don't we want to return from here if ignore_ppc is 0?

>
> if (ignore_ppc)
> return 0;
> --

Thanks,
Rafael
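
To make the two control flows in the quoted hunk easier to compare, here is a simplified user-space model. The DEMO_EV_* values are hypothetical stand-ins for the CPUFREQ_* events, ignore_ppc is passed by pointer instead of being a file-scope variable, and the final "only ADJUST is processed" check is taken from the changelog text, not from the diff itself.

```c
enum demo_event { DEMO_EV_START, DEMO_EV_ADJUST, DEMO_EV_NOTIFY };

/* Returns 1 if the event would reach the _PPC limit handling, 0 if ignored. */
static int demo_old_flow(enum demo_event event, int *ignore_ppc)
{
	/* Before the patch: only a START event resets a negative ignore_ppc. */
	if (event == DEMO_EV_START && *ignore_ppc <= 0) {
		*ignore_ppc = 0;
		return 0;
	}

	if (*ignore_ppc)
		return 0;

	if (event != DEMO_EV_ADJUST)
		return 0;

	return 1;
}

static int demo_new_flow(enum demo_event event, int *ignore_ppc)
{
	/* After the patch: any event resets a negative ignore_ppc, no return. */
	if (*ignore_ppc < 0)
		*ignore_ppc = 0;

	if (*ignore_ppc)
		return 0;

	if (event != DEMO_EV_ADJUST)
		return 0;

	return 1;
}
```

In this model the two versions differ only when an ADJUST event arrives while ignore_ppc is still negative: the old flow ignores it, the new flow resets ignore_ppc and processes it.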


Re: [PATCH] Usb: host - Fix possible NULL derefrence.

2017-01-29 Thread Thierry Reding
On Mon, Jan 30, 2017 at 07:45:21AM +0100, Greg Kroah-Hartman wrote:
> On Mon, Jan 30, 2017 at 10:36:29AM +0530, Shailendra Verma wrote:
> > of_device_get_match_data could return NULL, and so can cause
> > a NULL pointer dereference later.
> > 
> > Signed-off-by: Shailendra Verma 
> > ---
> >  drivers/usb/host/xhci-tegra.c |4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c
> > index a59fafb..890c778 100644
> > --- a/drivers/usb/host/xhci-tegra.c
> > +++ b/drivers/usb/host/xhci-tegra.c
> > @@ -903,6 +903,10 @@ static int tegra_xusb_probe(struct platform_device 
> > *pdev)
> > return -ENOMEM;
> >  
> > tegra->soc = of_device_get_match_data(&pdev->dev);
> > +   if (!tegra->soc) {
> 
> How would the driver be loaded and the probe function called if this
> returns NULL?
> 
> Is this ever possible?

No, it isn't. I've been NAK'ing this kind of patch for a while now.
There are two variants of this patch going about:

  1) checking the return value of of_match_device()
  2) checking the return value of of_device_get_match_data()

The same may also apply to of_match_node(), but I haven't seen that used
very much lately.

For of_match_device() the problem could technically occur if used in non
OF setups, because the device could be instantiated by hand in board
setup code. Tegra has been OF-only for a couple of years now, so there
is no way this can happen today.

of_device_get_match_data() is somewhat more complicated because it could
still return NULL if the OF table entry had its .data field set to NULL.
However in all drivers that I know that would be considered a bug, so
might as well let things crash at this point to make it immediately
obvious.

I had once been tempted to write a checkpatch rule for this, but I'm not
sure it's as easy as just warning if there's a check, because there are
some legitimate cases, even if they're very rare.

Thierry
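
The distinction between the two variants can be shown with a small user-space model of a match table. The types and functions here (demo_match, demo_match_device, demo_get_match_data) are hypothetical and only mimic the shape of the OF helpers; they are not the kernel implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an OF match table entry. */
struct demo_match {
	const char *compatible;
	const void *data;
};

static const int demo_soc_a = 1;

static const struct demo_match demo_table[] = {
	{ "vendor,soc-a", &demo_soc_a },
	{ "vendor,soc-b", NULL },	/* matches, but .data was left unset */
	{ NULL, NULL },			/* sentinel */
};

/* Models variant 1: returns the matching entry, or NULL if nothing matches. */
static const struct demo_match *
demo_match_device(const struct demo_match *table, const char *compatible)
{
	for (; table->compatible; table++)
		if (strcmp(table->compatible, compatible) == 0)
			return table;
	return NULL;
}

/* Models variant 2: returns the entry's .data, which may itself be NULL. */
static const void *
demo_get_match_data(const struct demo_match *table, const char *compatible)
{
	const struct demo_match *match = demo_match_device(table, compatible);

	return match ? match->data : NULL;
}
```

If probing only happens after a table match, the first helper cannot return NULL at that point, while the second can still return NULL for an entry like "vendor,soc-b", which is the driver bug described above.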


Re: [PATCH] I2c: busses - Fix possible NULL derefrence.

2017-01-29 Thread Thierry Reding
On Mon, Jan 30, 2017 at 10:33:07AM +0530, Shailendra Verma wrote:
> of_device_get_match_data could return NULL, and so can cause
> a NULL pointer dereference later.
> 
> Signed-off-by: Shailendra Verma 
> ---
>  drivers/i2c/busses/i2c-tegra.c |4 
>  1 file changed, 4 insertions(+)

This will never happen. Any match in the OF table that would cause the
->probe() to occur has a valid .data pointer associated with it.

Thierry


Re: [PATCH v6 3/5] cpuidle:powernv: Add helper function to populate powernv idle states.

2017-01-29 Thread Rafael J. Wysocki
On Mon, Jan 30, 2017 at 4:47 AM, Michael Ellerman  wrote:
> "Gautham R. Shenoy"  writes:
>
>> From: "Gautham R. Shenoy" 
>>
>> In the current code for powernv_add_idle_states, there is a lot of code
>> duplication while initializing an idle state in powernv_states table.
>>
>> Add an inline helper function to populate the powernv_states[] table
>> for a given idle state. Invoke this for populating the "Nap",
>> "Fastsleep" and the stop states in powernv_add_idle_states.
>>
>> Acked-by: Balbir Singh 
>> Signed-off-by: Gautham R. Shenoy 
>> ---
>>  drivers/cpuidle/cpuidle-powernv.c | 89 
>> +++
>>  include/linux/cpuidle.h   |  1 +
>
> I was going to merge this, but I see you've touched cpuidle.h, so I feel
> like I should get an ACK from the cpuidle folks.
>
> It's a fairly uncontroversial change, but it's their API.

OK, please add an ACK from me to it then.

Thanks,
Rafael


Re: [PATCH] Gpu: drm: tegra - Fix possible NULL derefrence.

2017-01-29 Thread Thierry Reding
On Mon, Jan 30, 2017 at 10:23:45AM +0530, Shailendra Verma wrote:
> of_match_device could return NULL, and so can cause a NULL
> pointer dereference later.
> 
> Signed-off-by: Shailendra Verma 
> ---
>  drivers/gpu/drm/tegra/sor.c |4 
>  1 file changed, 4 insertions(+)

No, this will never happen on Tegra. If you reach the ->probe() function
this pointer is guaranteed to be non-NULL.

Thierry


Re: [PATCH] pwm - Fix possible NULL derefrence.

2017-01-29 Thread Thierry Reding
On Mon, Jan 30, 2017 at 10:22:33AM +0530, Shailendra Verma wrote:
> of_match_device could return NULL, and so can cause a NULL
> pointer dereference later.
> 
> Signed-off-by: Shailendra Verma 
> ---
>  drivers/pwm/pwm-sun4i.c |4 
>  drivers/pwm/pwm-tegra.c |5 +
>  2 files changed, 9 insertions(+)

Both of these drivers will only run on purely OF systems, so this check
is unnecessary.

Thierry


Re: [RFC PATCH] scsi, block: fix duplicate bdi name registration crashes

2017-01-29 Thread Hannes Reinecke
On 01/29/2017 05:58 AM, Dan Williams wrote:
> Warnings of the following form occur because scsi reuses a devt number
> while the block layer still has it referenced as the name of the bdi
> [1]:
> 
>  WARNING: CPU: 1 PID: 93 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80
>  sysfs: cannot create duplicate filename '/devices/virtual/bdi/8:192'
>  [..]
>  Call Trace:
>   dump_stack+0x86/0xc3
>   __warn+0xcb/0xf0
>   warn_slowpath_fmt+0x5f/0x80
>   ? kernfs_path_from_node+0x4f/0x60
>   sysfs_warn_dup+0x62/0x80
>   sysfs_create_dir_ns+0x77/0x90
>   kobject_add_internal+0xb2/0x350
>   kobject_add+0x75/0xd0
>   device_add+0x15a/0x650
>   device_create_groups_vargs+0xe0/0xf0
>   device_create_vargs+0x1c/0x20
>   bdi_register+0x90/0x240
>   ? lockdep_init_map+0x57/0x200
>   bdi_register_owner+0x36/0x60
>   device_add_disk+0x1bb/0x4e0
>   ? __pm_runtime_use_autosuspend+0x5c/0x70
>   sd_probe_async+0x10d/0x1c0
>   async_run_entry_fn+0x39/0x170
> 
> This is a brute-force fix to pass the devt release information from
> sd_probe() to the locations where we register the bdi,
> device_add_disk(), and unregister the bdi, blk_cleanup_queue().
> 
> Thanks to Omar for the quick reproducer script [2]. This patch survives
> where an unmodified kernel fails in a few seconds.
> 
> [1]: https://marc.info/?l=linux-scsi&m=147116857810716&w=4
> [2]: http://marc.info/?l=linux-block&m=148554717109098&w=2
> 
> Cc: James Bottomley 
> Cc: Bart Van Assche 
> Cc: "Martin K. Petersen" 
> Cc: Christoph Hellwig 
> Cc: Jens Axboe 
> Reported-by: Omar Sandoval 
> Signed-off-by: Dan Williams 
> ---
>  block/blk-core.c   |1 +
>  block/genhd.c  |7 +++
>  drivers/scsi/sd.c  |   41 +
>  include/linux/blkdev.h |1 +
>  include/linux/genhd.h  |   17 +
>  5 files changed, 59 insertions(+), 8 deletions(-)
> 
Please check the patchset from Jan Kara (cf 'BDI lifetime fix' on
linux-block), which attempts to solve the same problem.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
h...@suse.de                   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
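
The failure mode in the quoted report can be illustrated with a small
userspace sketch. This is not kernel code: `name_registry`, `registry_add()`
and `registry_remove()` are hypothetical stand-ins for the sysfs namespace
that bdi_register() ends up using. The sketch only shows that registering a
"major:minor" name a second time, before the previous holder has dropped it,
fails with -EEXIST, which is the situation behind the sysfs_warn_dup() splat.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define REG_SLOTS 8
#define NAME_LEN  32

/* Hypothetical stand-in for the directory that holds bdi names. */
struct name_registry {
	char names[REG_SLOTS][NAME_LEN];
	int  used[REG_SLOTS];
};

/* Register "major:minor"; fails with -EEXIST while the name is still held. */
int registry_add(struct name_registry *r, unsigned int major, unsigned int minor)
{
	char name[NAME_LEN];
	int i, free_slot = -1;

	assert(r != NULL);
	snprintf(name, sizeof(name), "%u:%u", major, minor);

	for (i = 0; i < REG_SLOTS; i++) {
		if (r->used[i] && strcmp(r->names[i], name) == 0)
			return -EEXIST;
		if (!r->used[i] && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -ENOSPC;

	strcpy(r->names[free_slot], name);
	r->used[free_slot] = 1;
	return 0;
}

/* Drop the name; only after this can the same devt be registered again. */
int registry_remove(struct name_registry *r, unsigned int major, unsigned int minor)
{
	char name[NAME_LEN];
	int i;

	assert(r != NULL);
	snprintf(name, sizeof(name), "%u:%u", major, minor);

	for (i = 0; i < REG_SLOTS; i++) {
		if (r->used[i] && strcmp(r->names[i], name) == 0) {
			r->used[i] = 0;
			return 0;
		}
	}
	return -ENOENT;
}
```

In this model, the ordering problem is a registry_add() for 8:192 arriving
before the matching registry_remove(); any fix has to guarantee that the
remove happens first or that the devt is not handed out again until it has.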


linux-next: Tree for Jan 30

2017-01-29 Thread Stephen Rothwell
Hi all,

There will be no linux-next release until Monday (next-20170130).

Changes since 20170125:

New tree: spi-nor

The drm tree lost its build failure.

The block tree gained a build failure so I used the version from
next-20170125.

The tip tree gained a conflict against the drm-misc-fixes tree.

The rcu tree gained a conflict against the powerpc-fixes tree.

The staging tree gained a conflict against the staging.current tree.

The gpio tree gained a conflict against the staging tree.

Non-merge commits (relative to Linus' tree): 5603
 6347 files changed, 228529 insertions(+), 117188 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig (with
CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a
native build of tools/perf. After the final fixups (if any), I do an
x86_64 modules_install followed by builds for x86_64 allnoconfig,
powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig
and pseries_le_defconfig and i386, sparc and sparc64 defconfig.

Below is a summary of the state of the merge.

I am currently merging 253 trees (counting Linus' and 36 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (2c5d9555d6d9 Merge branch 'parisc-4.10-3' of 
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux)
Merging fixes/master (30066ce675d3 Merge branch 'linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6)
Merging kbuild-current/rc-fixes (c7858bf16c0b asm-prototypes: Clear any CPP 
defines before declaring the functions)
Merging arc-current/for-curr (9aed02feae57 ARC: [arcompact] handle unaligned 
access delay slot corner case)
Merging arm-current/fixes (90f92c631b21 ARM: 8613/1: Fix the uaccess crash on 
PB11MPCore)
Merging m68k-current/for-linus (ad595b77c4a8 m68k/atari: Use seq_puts() in 
atari_get_hardware_list())
Merging metag-fixes/fixes (35d04077ad96 metag: Only define 
atomic_dec_if_positive conditionally)
Merging powerpc-fixes/fixes (b5fa0f7f88ed powerpc: Fix build failure with clang 
due to BUILD_BUG_ON())
Merging sparc/master (5d0e7705774d sparc: Fixed typo in sstate.c. Replaced 
panicing with panicking)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and 
linking special files)
Merging net/master (1b1bc42c1692 Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net)
Merging ipsec/master (4e5da369df64 Documentation/networking: fix typo in 
mpls-sysctl)
Merging netfilter/master (a47b70ea86bd ravb: unmap descriptors when freeing 
rings)
Merging ipvs/master (045169816b31 Merge branch 'linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6)
Merging wireless-drivers/master (2b1d530cb315 MAINTAINERS: ath9k-devel is 
closed)
Merging mac80211/master (115865fa0826 mac80211: don't try to sleep in 
rate_control_rate_init())
Merging sound-current/for-linus (6cf4569ce356 Merge tag 'asoc-fix-v4.10-rc3' of 
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus)
Merging pci-current/for-linus (672980c62c68 PCI/ASPM: Handle PCI-to-PCIe 
bridges as roots of PCIe hierarchies)
Merging driver-core.current/driver-core-linus (49def1853334 Linux 4.10-rc4)
Merging tty.current/tty-linus (49def1853334 Linux 4.10-rc4)
Merging usb.current/usb-linus (a3683e0c1410 Merge tag 'usb-serial-4.10-rc6' of 
git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus)
Merging usb-gadget-fixes/fixes (efe357f4633a usb: dwc2: host: fix 
Wmaybe-uninitialized warning)
Merging usb-serial-fixes/usb-linus (5d03a2fd2292 USB: serial: option: add 
device ID for HP lt2523 (Novatel E371))
Merging usb-chipidea-fixes/ci-for-usb-stable (c7fbb09b2ea1 usb: chipidea: move 
the lock initialization to core file)
Merging phy/fixes (7ce7d89f4883 Linux 4.10-rc1)
Merging staging.current/staging-linus 

Re: [GIT PULL 4/4] arm64: dts: exynos: for v4.11, 2nd round

2017-01-29 Thread Krzysztof Kozlowski
On Mon, Jan 30, 2017 at 7:23 AM, Olof Johansson  wrote:
> Hi Krzysztof,
>
> On Sun, Jan 29, 2017 at 10:06:29PM +0200, Krzysztof Kozlowski wrote:
>> Hi,
>>
>> On top of previous pull request.
>>
>> This adds proper clocks to LPASS node on Exynos5433 which is needed
>> by Marek's patchset:
>>  - [PATCH v2 0/8] Pad retentions support for Exynos5433
>>https://lkml.kernel.org/r/1485419634-28331-1-git-send-email-m.szyprowski 
>> () samsung ! com
>>
>>
>> Cc: Marek Szyprowski 
>> Cc: Sylwester Nawrocki 
>> Cc: Linus Walleij 
>> Cc: Tomasz Figa 
>> Cc: Lee Jones 
>>
>> Best regards,
>> Krzysztof
>>
>>
>> The following changes since commit e4e381133241a27d732e78be09973b89a193eaf7:
>>
>>   arm64: dts: exynos: Enable HDMI/TV path on Exynos5433-TM2 (2017-01-11 
>> 18:20:28 +0200)
>>
>> are available in the git repository at:
>>
>>   git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git 
>> tags/samsung-dt64-4.11-2
>>
>> for you to fetch changes up to 7547162ac351483df3641f64e99e10be329dd6a2:
>>
>>   arm64: dts: exynos: Add clocks to Exynos5433 LPASS module (2017-01-26 
>> 22:04:20 +0200)
>
> I think you tagged the wrong branch here. The log message shows the right hash
> at the tip, but the tag is of 95648b747071d530b5bb983735cfe01b66bf, which
> seems to be on your for-next.
>
> Care to respin, so your tag and our merged branch match up?

Ahhh, damn it. I'll fix it.

Best regards,
Krzysztof


Re: [PATCH] Usb: host - Fix possible NULL derefrence.

2017-01-29 Thread Greg Kroah-Hartman
On Mon, Jan 30, 2017 at 10:36:29AM +0530, Shailendra Verma wrote:
> of_device_get_match_data could return NULL, and so can cause
> a NULL pointer dereference later.
> 
> Signed-off-by: Shailendra Verma 
> ---
>  drivers/usb/host/xhci-tegra.c |4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c
> index a59fafb..890c778 100644
> --- a/drivers/usb/host/xhci-tegra.c
> +++ b/drivers/usb/host/xhci-tegra.c
> @@ -903,6 +903,10 @@ static int tegra_xusb_probe(struct platform_device *pdev)
>   return -ENOMEM;
>  
>   tegra->soc = of_device_get_match_data(&pdev->dev);
> + if (!tegra->soc) {

How would the driver be loaded and the probe function called if this
returns NULL?

Is this ever possible?

thanks,

greg k-h
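
The reasoning behind the question can be sketched with a simplified
userspace model. It is not the driver core: `match_entry`, `model_driver`,
`model_match()` and `model_bus_bind()` are hypothetical names, and the
compatible string in the table is only illustrative. The model assumes a
driver that is bound exclusively through its match table, in which case
probe() is reached only after a match and receives that entry's data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for of_device_id and a platform driver. */
struct match_entry {
	const char *compatible;
	const void *data;
};

struct model_driver {
	const struct match_entry *table;	/* terminated by a NULL compatible */
	int (*probe)(const void *match_data);
};

/* Return the matching table entry, or NULL if nothing matches. */
const struct match_entry *model_match(const struct match_entry *table,
				      const char *compatible)
{
	assert(table != NULL && compatible != NULL);
	for (; table->compatible; table++)
		if (strcmp(table->compatible, compatible) == 0)
			return table;
	return NULL;
}

/*
 * Model of a match-table-only bus: probe() is called only after a match,
 * so it sees the matched entry's data. Returns -1 without calling probe()
 * when there is no match.
 */
int model_bus_bind(const struct model_driver *drv, const char *compatible)
{
	const struct match_entry *m = model_match(drv->table, compatible);

	if (!m)
		return -1;
	return drv->probe(m->data);
}

/* Example probe: returns 0 if it received match data, 1 if it got NULL. */
int model_probe(const void *match_data)
{
	return match_data ? 0 : 1;
}

static const int model_soc_data = 124;	/* placeholder for per-SoC data */

const struct match_entry model_table[] = {
	{ "nvidia,tegra124-xusb", &model_soc_data },	/* illustrative entry */
	{ NULL, NULL },
};
```

Under these assumptions the match data is non-NULL in probe() as long as
every table entry carries a data pointer; an entry with no data would still
hand NULL to probe().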


Re: linux-next: Tree for Jan 30

2017-01-29 Thread Stephen Rothwell
Hi all,

On Mon, 30 Jan 2017 17:49:01 +1100 Stephen Rothwell wrote:
>
> There will be no linux-next release until Monday (next-20170130).

Obviously, this is no longer relevant :-)

-- 
Cheers,
Stephen Rothwell


Re: linux-next: build failure after merge of the block tree

2017-01-29 Thread Christoph Hellwig
On Sun, Jan 29, 2017 at 06:53:42PM -0700, Jens Axboe wrote:
> Huh, I wonder how that snuck past my allmodconfig builds, that looks
> like a clear failure.

I also did tons of test builds and never saw it; I'm not sure why
the NVMe-SCSI code still somehow has an implicit include of scsi_cmnd.h.

But in the end it should not be using the definition anyway, and I sent
a patch on Saturday so that it doesn't:

[PATCH 1/5] nvme/scsi: don't rely on BLK_MAX_CDB

It might make sense to expedite that.


Re: [GIT PULL] STi DT update for v4.11 round 2

2017-01-29 Thread Olof Johansson
On Fri, Jan 27, 2017 at 05:15:32PM +, Patrice CHOTARD wrote:
> Here is the correct one
> 
> Please consider this second round of STi dts updates for v4.11:
> 
> The following changes since commit 2016ead446b98c42dffd9b6c03ce813e5cb3b810:
> 
>ARM: dts: STiH407-family: Supply Mailbox properties to delta RProc 
> (2017-01-12 17:23:39 +0100)
> 
> are available in the git repository at:
> 
>git://git.kernel.org/pub/scm/linux/kernel/git/pchotard/sti.git 
> sti-dt-for-v4.11-round2
> 
> for you to fetch changes up to c58736c160c1346bedda77d739f0f85710fa00cf:
> 
>ARM: dts: STiH407-family: Add missing pwm irq (2017-01-27 17:34:03 +0100)
> 
> 
> 
> STi DT fix:
> 
> Add missing interrupt node to properly probe the pwm device.

Merged, thanks!


-Olof



Re: [PATCH 0/8] staging: lustre: lnet: change wire protocol typedefs to proper structure

2017-01-29 Thread Greg Kroah-Hartman
On Sun, Jan 29, 2017 at 11:56:38PM +, James Simmons wrote:
> 
> > On Sat, 2017-01-21 at 19:40 -0500, James Simmons wrote:
> > > The upstream kernel requires proper structures so
> > > convert nearly all the LNet wire protocols typedefs in
> > > the LNet core.
> > 
> > Thanks.
> > 
> > Perhaps s/\bWIRE_ATTR\b/__packed/g one day too
> 
> I'd like to keep that one.

Sorry, but no.

> The point of WIRE_ATTR isn't to be some abstraction but to label that
> struct as something that goes over the wire. This lets people know
> that it would break something if you change that structure. Looks like
> I need to send a patch that adds a comment explaining the meaning of
> WIRE_ATTR.

No, please remove it; it's not something that any other kernel subsystem
uses.

It's easy to know if you will break something: anything that crosses the
user/kernel boundary falls into that category, so if it is in a uapi .h
file, that's going to be the case.

thanks,

greg k-h
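
For readers unfamiliar with what the suggested s/WIRE_ATTR/__packed/
substitution affects, here is a minimal userspace sketch of a packed
structure. The struct and field names are hypothetical and not taken from
LNet; `__attribute__((packed))` is the GCC/Clang attribute that the kernel's
`__packed` macro expands to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header with natural alignment; the compiler may add padding. */
struct wire_hdr_padded {
	uint8_t  type;
	uint64_t nid;
	uint32_t length;
};

/* Same fields with no padding, as needed for a fixed on-wire layout. */
struct wire_hdr_packed {
	uint8_t  type;
	uint64_t nid;
	uint32_t length;
} __attribute__((packed));

/* 1 + 8 + 4 bytes: the packed layout is the sum of the field sizes. */
static_assert(sizeof(struct wire_hdr_packed) == 13,
	      "packed header must have no padding");

size_t packed_size(void)       { return sizeof(struct wire_hdr_packed); }
size_t packed_nid_offset(void) { return offsetof(struct wire_hdr_packed, nid); }
size_t padded_size(void)       { return sizeof(struct wire_hdr_padded); }
```

The padded variant's size depends on the ABI, which is why a structure
shared with another machine or with user space needs an explicit layout;
the attribute itself says nothing about whether the struct is part of an ABI.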


Re: Exynos5422 EHCI (USB 2.0) - Odroid XU4 - port 1 resume error -110

2017-01-29 Thread Anand Moon
Hi Krzysztof,

On 29 January 2017 at 19:05, Krzysztof Kozlowski  wrote:
> Hi,
>
> On Odroid XU4 with an external usb2514 hub (evaluation board from SMSC
> or TI) connected to the USB2.0 port of EHCI controller, whenever I
> plug some USB device into the usb2514 hub I see errors like:
> [   73.969179] exynos-ehci 1211.usb: port 1 resume error -110
> [   74.003259] usb 1-1: USB disconnect, device number 2
> [   74.017432] usb usb1-port1: cannot reset (err = -32)
> [   74.021141] usb usb1-port1: cannot reset (err = -32)
> [   74.026832] usb usb1-port1: cannot reset (err = -32)
> [   74.030974] usb usb1-port1: cannot reset (err = -32)
> [   74.036677] usb usb1-port1: cannot reset (err = -32)
> [   74.041919] usb usb1-port1: Cannot enable. Maybe the USB cable is bad?
> [   74.140923] usb usb1-port1: unable to enumerate USB device
>
> Flooding the console. USB device does not work.
>
> Tested:
> 1. next-20170125,
> 2. mainline v4.10-rc5-393-g53cd1ad1a68f,
> 3. Hardkernel vendor kernel 3.10
>
>
> It does not look like an issue of usb2514 but rather a combination of
> usb2514 + Exynos EHCI because:
> 1. Happens only when usb2514 hub is directly connected to the USB2.0
> port of EHCI controller.
> 2. If usb2514 is connected through other hub (like to other ports on
> the board) - everything works fine.
> 3. I tested two different usb2514 hubs - one from TI, second from
> SMSC. Both behave the same.
> 4. I had the same issue on DWC3 controller (USB 3.0 port) and in that
> case helped ENABLING bits DWC3_GUSB2PHYCFG_SUSPHY and
> DWC3_GUSB3PIPECTL_SUSPHY which is I think equal to disabling the
> snps,dis_u3_susphy_quirk quirks.
>
>
> It seems that issue is strictly related to Exynos5422 USB 2.0 driver
> (phy?) and this hub. The drivers involved:
>  - drivers/phy/phy-exynos5250-usb2.c
>  - drivers/phy/phy-samsung-usb2.c
>  - drivers/usb/host/ehci-exynos.c
>
> Any ideas?
>
> Best regards,
> Krzysztof

I feel the vbus input configuration is missing on the Odroid XU4 for
both the EHCI 2.0 port and the DWC3 controller.

The id-gpio and vbus-gpio configuration is also missing.

I am attaching a small set of changes from my side, but they are not
working correctly at my end.

-Best Regards
-Anand


usbvbus.patch
Description: Binary data
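
As a reading aid for the log excerpt quoted above, the two numeric error
codes can be decoded with a tiny lookup. This is a sketch: the table holds
only the two values seen in the log, using the generic Linux errno numbers
(EPIPE = 32, ETIMEDOUT = 110), and `kernel_errno_name()` is a hypothetical
helper name. In the USB stack, -EPIPE commonly indicates a stalled endpoint
and -ETIMEDOUT a transfer or resume that did not complete in time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct errno_name {
	int code;
	const char *name;
};

/* Only the codes that appear in the quoted log; generic Linux values. */
static const struct errno_name errno_table[] = {
	{ 32,  "EPIPE" },
	{ 110, "ETIMEDOUT" },
};

/*
 * Map a negative kernel-style error value (as printed in the log) to its
 * symbolic name. Returns NULL for codes that are not in the table.
 */
const char *kernel_errno_name(int err)
{
	size_t i;

	assert(err <= 0);	/* kernel logs print errors as negative values */
	for (i = 0; i < sizeof(errno_table) / sizeof(errno_table[0]); i++)
		if (errno_table[i].code == -err)
			return errno_table[i].name;
	return NULL;
}
```

With this, "port 1 resume error -110" reads as a timeout during port resume,
and "cannot reset (err = -32)" as a stall reported while resetting the port.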


RE: [PATCH v2 2/2] PCI: Xilinx NWL: Fix, proc interrupts for legacy virtual irq shown as edge

2017-01-29 Thread Bharat Kumar Gogada
> 
> On 25/01/17 08:52, Bharat Kumar Gogada wrote:
> > - Legacy interrupts are level triggered, virtual irq line of End Point
> > shows as edge in /proc/interrupts.
> > - Setting irq flags of virtual irq line of EP to level triggered at
> > the time of mapping.
> >
> > Signed-off-by: Bharat Kumar Gogada 
> > ---
> >  drivers/pci/host/pcie-xilinx-nwl.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/pci/host/pcie-xilinx-nwl.c
> > b/drivers/pci/host/pcie-xilinx-nwl.c
> > index 6ac3e1d..1cddd1f 100644
> > --- a/drivers/pci/host/pcie-xilinx-nwl.c
> > +++ b/drivers/pci/host/pcie-xilinx-nwl.c
> > @@ -434,6 +434,7 @@ static int nwl_legacy_map(struct irq_domain
> > *domain, unsigned int irq,  {
> > irq_set_chip_and_handler(irq, &nwl_leg_irq_chip, handle_level_irq);
> > irq_set_chip_data(irq, domain->host_data);
> > +   irq_set_status_flags(irq, IRQ_LEVEL);
> >
> > return 0;
> >  }
> >
> 
> As said in my previous review [1], this should be folded into the previous
> patch, as it doesn't make much sense on its own.
> 
Agreed, I will fold this into the previous patch.

bharat
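
The effect of the one-line change quoted above can be shown with a small
userspace model. It is a sketch under stated assumptions: `model_irq` and
both helper functions are hypothetical names, and `MODEL_IRQ_LEVEL` is an
illustrative bit standing in for the kernel's IRQ_LEVEL status flag. The
model only captures that the trigger column printed for an interrupt is
derived from that flag, so it reads "Edge" until the flag is set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative bit standing in for the kernel's IRQ_LEVEL status flag. */
#define MODEL_IRQ_LEVEL (1u << 8)

/* Hypothetical, minimal stand-in for per-interrupt state. */
struct model_irq {
	unsigned int status_flags;
};

/* Models irq_set_status_flags(): OR the requested bits into the state. */
void model_irq_set_status_flags(struct model_irq *irq, unsigned int set)
{
	assert(irq != NULL);
	irq->status_flags |= set;
}

/* Models the trigger column: "Level" only if the level flag is set. */
const char *model_irq_trigger_name(const struct model_irq *irq)
{
	assert(irq != NULL);
	return (irq->status_flags & MODEL_IRQ_LEVEL) ? "Level" : "Edge";
}
```

Installing a level-type flow handler alone does not change what is reported;
in this model, as in the patch, the flag has to be set explicitly at mapping
time.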


[RFC V2 12/12] mm: Tag VMA with VM_CDM flag explicitly during mbind(MPOL_BIND)

2017-01-29 Thread Anshuman Khandual
Mark all the applicable VMAs with VM_CDM explicitly during an
mbind(MPOL_BIND) call if the user-provided nodemask contains a CDM node.

Signed-off-by: Anshuman Khandual 
---
 mm/mempolicy.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 78e095b..4482140 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -175,6 +175,16 @@ static void mpol_relative_nodemask(nodemask_t *ret, const nodemask_t *orig,
 }
 
 #ifdef CONFIG_COHERENT_DEVICE
+static inline void set_vm_cdm(struct vm_area_struct *vma)
+{
+   vma->vm_flags |= VM_CDM;
+}
+
+static inline void clr_vm_cdm(struct vm_area_struct *vma)
+{
+   vma->vm_flags &= ~VM_CDM;
+}
+
 static void mark_vma_cdm(nodemask_t *nmask,
struct page *page, struct vm_area_struct *vma)
 {
@@ -191,6 +201,9 @@ static void mark_vma_cdm(nodemask_t *nmask,
vma->vm_flags |= VM_CDM;
 }
 #else
+static inline void set_vm_cdm(struct vm_area_struct *vma) { }
+static inline void clr_vm_cdm(struct vm_area_struct *vma) { }
+
 static void mark_vma_cdm(nodemask_t *nmask,
struct page *page, struct vm_area_struct *vma)
 {
@@ -770,6 +783,10 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
vmstart = max(start, vma->vm_start);
vmend   = min(end, vma->vm_end);
 
+   if ((new_pol->mode == MPOL_BIND)
+   && nodemask_has_cdm(new_pol->v.nodes))
+   set_vm_cdm(vma);
+
if (mpol_equal(vma_policy(vma), new_pol))
continue;
 
-- 
2.9.3
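
The tagging rule in this patch can be modelled in userspace as a minimal sketch. Everything prefixed with `mock_` or `MOCK_` is an assumption made up for illustration: the flag and mode values are not the kernel's, and the nodemask is reduced to a 64-bit bitmap with a second bitmap marking which nodes are CDM nodes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values chosen for this model only, not the kernel's. */
#define MOCK_VM_CDM    (1UL << 5)
#define MOCK_MPOL_BIND 2

struct mock_vma {
	unsigned long vm_flags;
};

/* Assumption: one bit per node; cdm_nodes marks the coherent device nodes. */
static inline bool mock_nodemask_has_cdm(uint64_t nodes, uint64_t cdm_nodes)
{
	return (nodes & cdm_nodes) != 0;
}

static inline void mock_set_vm_cdm(struct mock_vma *vma)
{
	vma->vm_flags |= MOCK_VM_CDM;
}

static inline void mock_clr_vm_cdm(struct mock_vma *vma)
{
	vma->vm_flags &= ~MOCK_VM_CDM;
}

/* Same condition as the hunk added to mbind_range(): tag the VMA only
 * when the policy mode is BIND and the nodemask contains a CDM node. */
static inline void mock_mbind_tag(struct mock_vma *vma, int mode,
				  uint64_t nodes, uint64_t cdm_nodes)
{
	if (mode == MOCK_MPOL_BIND && mock_nodemask_has_cdm(nodes, cdm_nodes))
		mock_set_vm_cdm(vma);
}
```

Note that in this model, as in the posted hunk, nothing clears the flag again when a later policy no longer includes a CDM node; `mock_clr_vm_cdm()` exists but has no caller.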



[RFC V2 09/12] mm: Exclude CDM marked VMAs from auto NUMA

2017-01-29 Thread Anshuman Khandual
The kernel cannot track device memory accesses behind VMAs containing CDM
memory, so VMAs marked with VM_CDM should be excluded from the automatic
NUMA migration scheme. This patch also adds a new helper, is_cdm_vma(),
to detect any VMA marked with the VM_CDM flag.

Signed-off-by: Anshuman Khandual 
---
 include/linux/mempolicy.h | 14 ++
 kernel/sched/fair.c   |  3 ++-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 5f4d828..ff0c6bc 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -172,6 +172,20 @@ extern int mpol_parse_str(char *str, struct mempolicy **mpol);
 
 extern void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol);
 
+#ifdef CONFIG_COHERENT_DEVICE
+static inline bool is_cdm_vma(struct vm_area_struct *vma)
+{
+   if (vma->vm_flags & VM_CDM)
+   return true;
+   return false;
+}
+#else
+static inline bool is_cdm_vma(struct vm_area_struct *vma)
+{
+   return false;
+}
+#endif
+
 /* Check if a vma is migratable */
 static inline bool vma_migratable(struct vm_area_struct *vma)
 {
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6559d19..523508c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2482,7 +2482,8 @@ void task_numa_work(struct callback_head *work)
}
for (; vma; vma = vma->vm_next) {
if (!vma_migratable(vma) || !vma_policy_mof(vma) ||
-   is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_MIXEDMAP)) {
+   is_vm_hugetlb_page(vma) || is_cdm_vma(vma) ||
+   (vma->vm_flags & VM_MIXEDMAP)) {
continue;
}
 
-- 
2.9.3
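
As a rough illustration of the effect on the scan loop, the sketch below walks a linked list of mock VMAs and skips the CDM ones, the way the patched condition in task_numa_work() does. The `mock_` types and flag values are assumptions for this model only, and the other skip conditions (migratability, hugetlb, policy) are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model only. */
#define MOCK_VM_MIXEDMAP (1UL << 0)
#define MOCK_VM_CDM      (1UL << 1)

struct mock_vma {
	unsigned long vm_flags;
	struct mock_vma *vm_next;
};

static inline bool mock_is_cdm_vma(const struct mock_vma *vma)
{
	return (vma->vm_flags & MOCK_VM_CDM) != 0;
}

/* Count the VMAs a scan would visit, skipping CDM and MIXEDMAP mappings. */
static inline size_t mock_count_scannable(const struct mock_vma *vma)
{
	size_t n = 0;

	for (; vma; vma = vma->vm_next) {
		if (mock_is_cdm_vma(vma) || (vma->vm_flags & MOCK_VM_MIXEDMAP))
			continue;
		n++;
	}
	return n;
}
```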



[PATCH] Pinctrl: mvebu - Fix possible NULL derefrence.

2017-01-29 Thread Shailendra Verma
of_match_device() can return NULL, which would cause a NULL pointer
dereference when match->data is read later in probe. Check the result
and return -ENODEV if no match is found.

Signed-off-by: Shailendra Verma 
---
 drivers/pinctrl/mvebu/pinctrl-dove.c |4 
 drivers/pinctrl/mvebu/pinctrl-kirkwood.c |4 
 drivers/pinctrl/mvebu/pinctrl-orion.c|4 
 3 files changed, 12 insertions(+)

diff --git a/drivers/pinctrl/mvebu/pinctrl-dove.c b/drivers/pinctrl/mvebu/pinctrl-dove.c
index f93ae0d..f9fe6d4 100644
--- a/drivers/pinctrl/mvebu/pinctrl-dove.c
+++ b/drivers/pinctrl/mvebu/pinctrl-dove.c
@@ -769,6 +769,10 @@ static int dove_pinctrl_probe(struct platform_device *pdev)
struct resource fb_res;
const struct of_device_id *match =
of_match_device(dove_pinctrl_of_match, &pdev->dev);
+   if (!match) {
+   dev_err(&pdev->dev, "Error: No device match found\n");
+   return -ENODEV;
+   }
pdev->dev.platform_data = (void *)match->data;
 
/*
diff --git a/drivers/pinctrl/mvebu/pinctrl-kirkwood.c b/drivers/pinctrl/mvebu/pinctrl-kirkwood.c
index 5f89c26..75efb83 100644
--- a/drivers/pinctrl/mvebu/pinctrl-kirkwood.c
+++ b/drivers/pinctrl/mvebu/pinctrl-kirkwood.c
@@ -472,6 +472,10 @@ static int kirkwood_pinctrl_probe(struct platform_device *pdev)
struct resource *res;
const struct of_device_id *match =
of_match_device(kirkwood_pinctrl_of_match, &pdev->dev);
+   if (!match) {
+   dev_err(&pdev->dev, "Error: No device match found\n");
+   return -ENODEV;
+   }
pdev->dev.platform_data = (void *)match->data;
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
diff --git a/drivers/pinctrl/mvebu/pinctrl-orion.c b/drivers/pinctrl/mvebu/pinctrl-orion.c
index 84e1441..559ac32 100644
--- a/drivers/pinctrl/mvebu/pinctrl-orion.c
+++ b/drivers/pinctrl/mvebu/pinctrl-orion.c
@@ -225,6 +225,10 @@ static int orion_pinctrl_probe(struct platform_device *pdev)
of_match_device(orion_pinctrl_of_match, &pdev->dev);
struct resource *res;
 
+   if (!match) {
+   dev_err(&pdev->dev, "Error: No device match found\n");
+   return -ENODEV;
+   }
pdev->dev.platform_data = (void*)match->data;
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-- 
1.7.9.5
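
The pattern this patch applies, checking a table lookup for NULL before dereferencing it, can be shown with a small userspace sketch. The `mock_match` table, `mock_match_device()` and `mock_probe()` are hypothetical stand-ins written for this example; they are not the kernel's of_device_id or of_match_device().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a match table entry. */
struct mock_match {
	const char *compatible;
	const void *data;
};

/* Return the first entry whose string matches, or NULL if none does.
 * The table is terminated by an entry with a NULL compatible string. */
static inline const struct mock_match *
mock_match_device(const struct mock_match *table, const char *compatible)
{
	for (; table->compatible; table++)
		if (strcmp(table->compatible, compatible) == 0)
			return table;
	return NULL;
}

/* Probe-style caller: bail out with -ENODEV instead of dereferencing NULL. */
static inline int mock_probe(const struct mock_match *table,
			     const char *compatible,
			     const void **platform_data)
{
	const struct mock_match *match = mock_match_device(table, compatible);

	if (!match)
		return -ENODEV;

	*platform_data = match->data;
	return 0;
}
```

Without the check, the failing lookup would reach `match->data` with `match == NULL`.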



Re: [PATCH v7 00/20] ILP32 for ARM64

2017-01-29 Thread Yury Norov
On Mon, Jan 09, 2017 at 04:59:37PM +0530, Yury Norov wrote:
> This series enables aarch64 with ilp32 mode.
> 
> As supporting work, it introduces the ARCH_32BIT_OFF_T configuration
> option, which is enabled for existing 32-bit architectures but disabled
> for new arches (so 64-bit off_t is used by new userspace). It also
> deprecates the getrlimit and setrlimit syscalls in favor of prlimit64.
> 
> This version is based on linux-next from 2017-01-09. It works with
> glibc-2.25, and was tested with LTP, glibc testsuite, trinity, lmbench,
> CPUSpec.
> 
> This is not an RFC anymore. I believe that all ABI and implementation
> issues are resolved now. The way the kernel clears the top halves of
> registers is probably the last open question, and because there has been
> no objection to the current approach for more than 6 months, I think the
> community agrees with it.
> 
> Patches 1, 2, 3 and 8 are general, and may be applied separately.
> 
> The current version does not introduce ABI changes compared to RFC3.
> Kernel and GLIBC trees:
> https://github.com/norov/linux/tree/ilp32-2017-01-09
> https://github.com/norov/glibc/tree/dev9

Hi Arnd, Catalin,

Is there any progress with the review? I am asking particularly about the
generic patches (1-3, 8). They have all been reviewed, and most of them
acked by you, but they are still not upstream.

Yury


[RFC 1/6] mm: add wrap for page accouting index

2017-01-29 Thread Shaohua Li
We calculate the page/LRU accounting index by checking whether the page/LRU
is file-backed. This becomes a problem once a new LRU list is introduced,
so add a wrapper for the calculation.

The patch is based on Minchan's previous patch.

Cc: Michal Hocko 
Cc: Minchan Kim 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Rik van Riel 
Cc: Mel Gorman 
Signed-off-by: Shaohua Li 
---
 include/linux/mm_inline.h | 26 ++
 include/trace/events/vmscan.h | 23 ---
 mm/compaction.c   |  3 +--
 mm/khugepaged.c   |  6 ++
 mm/memory-failure.c   |  3 +--
 mm/memory_hotplug.c   |  3 +--
 mm/mempolicy.c|  3 +--
 mm/migrate.c  | 27 +--
 mm/vmscan.c   | 19 ++-
 9 files changed, 63 insertions(+), 50 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index e030a68..0dddc2c 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -124,6 +124,32 @@ static __always_inline enum lru_list page_lru(struct page *page)
return lru;
 }
 
+/*
+ * lru_isolate_index - which item should a lru be accounted for
+ * @lru: the lru list
+ *
+ * Returns the accounting item index of the lru
+ */
+static inline int lru_isolate_index(enum lru_list lru)
+{
+   if (lru == LRU_INACTIVE_FILE || lru == LRU_ACTIVE_FILE)
+   return NR_ISOLATED_FILE;
+   return NR_ISOLATED_ANON;
+}
+
+/*
+ * page_isolate_index - which item should a page be accounted for
+ * @page: the page to test
+ *
+ * Returns the accounting item index of the page
+ */
+static inline int page_isolate_index(struct page *page)
+{
+   if (!PageSwapBacked(page))
+   return NR_ISOLATED_FILE;
+   return NR_ISOLATED_ANON;
+}
+
 #define lru_to_page(head) (list_entry((head)->prev, struct page, lru))
 
 #endif
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 27e8a5c..fab386d 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -31,9 +31,10 @@
(RECLAIM_WB_ASYNC) \
)
 
-#define trace_shrink_flags(file) \
+#define trace_shrink_flags(isolate_index) \
( \
-   (file ? RECLAIM_WB_FILE : RECLAIM_WB_ANON) | \
+   (isolate_index == NR_ISOLATED_FILE ? RECLAIM_WB_FILE : \
+   RECLAIM_WB_ANON) | \
(RECLAIM_WB_ASYNC) \
)
 
@@ -345,11 +346,11 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
unsigned long nr_congested, unsigned long nr_immediate,
unsigned long nr_activate, unsigned long nr_ref_keep,
unsigned long nr_unmap_fail,
-   int priority, int file),
+   int priority, int isolate_index),
 
TP_ARGS(nid, nr_scanned, nr_reclaimed, nr_dirty, nr_writeback,
nr_congested, nr_immediate, nr_activate, nr_ref_keep,
-   nr_unmap_fail, priority, file),
+   nr_unmap_fail, priority, isolate_index),
 
TP_STRUCT__entry(
__field(int, nid)
@@ -378,7 +379,7 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
__entry->nr_ref_keep = nr_ref_keep;
__entry->nr_unmap_fail = nr_unmap_fail;
__entry->priority = priority;
-   __entry->reclaim_flags = trace_shrink_flags(file);
+   __entry->reclaim_flags = trace_shrink_flags(isolate_index);
),
 
TP_printk("nid=%d nr_scanned=%ld nr_reclaimed=%ld nr_dirty=%ld nr_writeback=%ld nr_congested=%ld nr_immediate=%ld nr_activate=%ld nr_ref_keep=%ld nr_unmap_fail=%ld priority=%d flags=%s",
@@ -395,9 +396,9 @@ TRACE_EVENT(mm_vmscan_lru_shrink_active,
 
TP_PROTO(int nid, unsigned long nr_taken,
unsigned long nr_active, unsigned long nr_deactivated,
-   unsigned long nr_referenced, int priority, int file),
+   unsigned long nr_referenced, int priority, int isolate_index),
 
-   TP_ARGS(nid, nr_taken, nr_active, nr_deactivated, nr_referenced, priority, file),
+   TP_ARGS(nid, nr_taken, nr_active, nr_deactivated, nr_referenced, priority, isolate_index),
 
TP_STRUCT__entry(
__field(int, nid)
@@ -416,7 +417,7 @@ TRACE_EVENT(mm_vmscan_lru_shrink_active,
__entry->nr_deactivated = nr_deactivated;
__entry->nr_referenced = nr_referenced;
__entry->priority = priority;
-   __entry->reclaim_flags = trace_shrink_flags(file);
+   __entry->reclaim_flags = trace_shrink_flags(isolate_index);
),
 
TP_printk("nid=%d nr_taken=%ld nr_active=%ld nr_deactivated=%ld nr_referenced=%ld priority=%d flags=%s",
@@ -432,9 +433,9 @@ TRACE_EVENT(mm_vmscan_inactive_list_is_low,
TP_PROTO(int nid, int reclaim_idx,
unsigned long total_inactive, unsigned long inactive,
unsigned long total_active, unsigned long 
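
The two wrappers this patch adds to mm_inline.h can be modelled in userspace as a minimal sketch. The `mock_` enums and the `swap_backed` field are assumptions that stand in for the kernel's enum lru_list, the NR_ISOLATED_* counters and PageSwapBacked(); only the decision logic follows the posted hunk.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's LRU lists and isolation counters. */
enum mock_lru_list {
	MOCK_LRU_INACTIVE_ANON,
	MOCK_LRU_ACTIVE_ANON,
	MOCK_LRU_INACTIVE_FILE,
	MOCK_LRU_ACTIVE_FILE,
	MOCK_LRU_UNEVICTABLE
};

enum mock_stat_item {
	MOCK_NR_ISOLATED_ANON,
	MOCK_NR_ISOLATED_FILE
};

struct mock_page {
	bool swap_backed;	/* models PageSwapBacked(page) */
};

/* Which counter an LRU list is accounted to: file lists map to FILE,
 * everything else maps to ANON. */
static inline int mock_lru_isolate_index(enum mock_lru_list lru)
{
	if (lru == MOCK_LRU_INACTIVE_FILE || lru == MOCK_LRU_ACTIVE_FILE)
		return MOCK_NR_ISOLATED_FILE;
	return MOCK_NR_ISOLATED_ANON;
}

/* Which counter a page is accounted to, based on swap backing. */
static inline int mock_page_isolate_index(const struct mock_page *page)
{
	if (!page->swap_backed)
		return MOCK_NR_ISOLATED_FILE;
	return MOCK_NR_ISOLATED_ANON;
}
```

Because callers receive a counter index instead of a file/anon boolean, a later LRU list only needs a new case inside the wrappers rather than a change at every call site.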

[RFC 6/6] mm: enable MADV_FREE for swapless system

2017-01-29 Thread Shaohua Li
MADV_FREE pages can now be reclaimed easily even on a swapless system, so
we can safely enable MADV_FREE for all systems.

Cc: Michal Hocko 
Cc: Minchan Kim 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Rik van Riel 
Cc: Mel Gorman 
Signed-off-by: Shaohua Li 
---
 mm/madvise.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 78b4b02..047cfd4 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -579,13 +579,7 @@ madvise_vma(struct vm_area_struct *vma, struct vm_area_struct **prev,
case MADV_WILLNEED:
return madvise_willneed(vma, prev, start, end);
case MADV_FREE:
-   /*
-* XXX: In this implementation, MADV_FREE works like
-* MADV_DONTNEED on swapless system or full swap.
-*/
-   if (get_nr_swap_pages() > 0)
-   return madvise_free(vma, prev, start, end);
-   /* passthrough */
+   return madvise_free(vma, prev, start, end);
case MADV_DONTNEED:
return madvise_dontneed(vma, prev, start, end);
default:
-- 
2.9.3
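
The behavioural change in the madvise_vma() switch can be summarised with a small model. The `mock_` enums and both dispatch functions are assumptions written for this sketch; they only mirror which handler is chosen before and after the patch, not what the handlers do.

```c
/* Hypothetical advice values and outcomes for this model only. */
enum mock_advice { MOCK_MADV_DONTNEED, MOCK_MADV_FREE };
enum mock_action { MOCK_ACT_DONTNEED, MOCK_ACT_FREE, MOCK_ACT_EINVAL };

/* Before the patch: MADV_FREE degrades to MADV_DONTNEED when no swap
 * pages are available (swapless system or full swap). */
static inline enum mock_action mock_dispatch_old(int behavior,
						 long nr_swap_pages)
{
	switch (behavior) {
	case MOCK_MADV_FREE:
		return nr_swap_pages > 0 ? MOCK_ACT_FREE : MOCK_ACT_DONTNEED;
	case MOCK_MADV_DONTNEED:
		return MOCK_ACT_DONTNEED;
	default:
		return MOCK_ACT_EINVAL;
	}
}

/* After the patch: MADV_FREE always takes the lazy-free path. */
static inline enum mock_action mock_dispatch_new(int behavior)
{
	switch (behavior) {
	case MOCK_MADV_FREE:
		return MOCK_ACT_FREE;
	case MOCK_MADV_DONTNEED:
		return MOCK_ACT_DONTNEED;
	default:
		return MOCK_ACT_EINVAL;
	}
}
```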



[RFC 4/6] mm: move MADV_FREE pages into LRU_LAZYFREE list

2017-01-29 Thread Shaohua Li
Move the MADV_FREE pages into the LRU_LAZYFREE list. The reason for doing
this is described in the previous patch. The next patch will reclaim the
pages.

The patch is based on Minchan's previous patch.

Cc: Michal Hocko 
Cc: Minchan Kim 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Rik van Riel 
Cc: Mel Gorman 
Signed-off-by: Shaohua Li 
---
 include/linux/swap.h |  2 +-
 mm/huge_memory.c |  5 ++---
 mm/madvise.c |  3 +--
 mm/swap.c| 51 +--
 4 files changed, 33 insertions(+), 28 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 45e91dd..e35bef5 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -279,7 +279,7 @@ extern void lru_add_drain_cpu(int cpu);
 extern void lru_add_drain_all(void);
 extern void rotate_reclaimable_page(struct page *page);
 extern void deactivate_file_page(struct page *page);
-extern void deactivate_page(struct page *page);
+extern void move_page_to_lazyfree_list(struct page *page);
 extern void swap_setup(void);
 
 extern void add_page_to_unevictable_list(struct page *page);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ffa7ed5..57daef7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1391,9 +1391,6 @@ bool madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
ClearPageDirty(page);
unlock_page(page);
 
-   if (PageActive(page))
-   deactivate_page(page);
-
if (pmd_young(orig_pmd) || pmd_dirty(orig_pmd)) {
orig_pmd = pmdp_huge_get_and_clear_full(tlb->mm, addr, pmd,
tlb->fullmm);
@@ -1404,6 +1401,8 @@ bool madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
set_pmd_at(mm, addr, pmd, orig_pmd);
tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
}
+
+   move_page_to_lazyfree_list(page);
ret = true;
 out:
spin_unlock(ptl);
diff --git a/mm/madvise.c b/mm/madvise.c
index c867d88..78b4b02 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -378,10 +378,9 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
ptent = pte_mkclean(ptent);
ptent = pte_wrprotect(ptent);
set_pte_at(mm, addr, pte, ptent);
-   if (PageActive(page))
-   deactivate_page(page);
tlb_remove_tlb_entry(tlb, pte, addr);
}
+   move_page_to_lazyfree_list(page);
}
 out:
if (nr_swap) {
diff --git a/mm/swap.c b/mm/swap.c
index c4910f1..f9e70e8 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -46,7 +46,7 @@ int page_cluster;
 static DEFINE_PER_CPU(struct pagevec, lru_add_pvec);
 static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
 static DEFINE_PER_CPU(struct pagevec, lru_deactivate_file_pvecs);
-static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
+static DEFINE_PER_CPU(struct pagevec, lru_lazyfree_pvecs);
 #ifdef CONFIG_SMP
 static DEFINE_PER_CPU(struct pagevec, activate_page_pvecs);
 #endif
@@ -268,6 +268,10 @@ static void __activate_page(struct page *page, struct lruvec *lruvec,
int lru = page_lru_base_type(page);
 
del_page_from_lru_list(page, lruvec, lru);
+   if (lru == LRU_LAZYFREE) {
+   ClearPageLazyFree(page);
+   lru = LRU_INACTIVE_ANON;
+   }
SetPageActive(page);
lru += LRU_ACTIVE;
add_page_to_lru_list(page, lruvec, lru);
@@ -455,6 +459,8 @@ void add_page_to_unevictable_list(struct page *page)
ClearPageActive(page);
SetPageUnevictable(page);
SetPageLRU(page);
+   if (page_is_lazyfree(page))
+   ClearPageLazyFree(page);
add_page_to_lru_list(page, lruvec, LRU_UNEVICTABLE);
spin_unlock_irq(>lru_lock);
 }
@@ -561,20 +567,21 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec,
 }
 
 
-static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec,
+static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec,
void *arg)
 {
-   if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) {
-   int file = page_is_file_cache(page);
-   int lru = page_lru_base_type(page);
+   if (PageLRU(page) && PageSwapBacked(page) && !PageLazyFree(page) &&
+   !PageUnevictable(page)) {
+   unsigned int nr_pages = PageTransHuge(page) ? HPAGE_PMD_NR : 1;
+   bool active = PageActive(page);
 
-   del_page_from_lru_list(page, lruvec, lru + LRU_ACTIVE);
+   del_page_from_lru_list(page, lruvec, LRU_INACTIVE_ANON + active);
ClearPageActive(page);
ClearPageReferenced(page);
-   add_page_to_lru_list(page, lruvec, lru);
+   

[RFC 5/6] mm: reclaim lazyfree pages

2017-01-29 Thread Shaohua Li
When memory pressure is high, we must free lazyfree pages. If we free
lazyfree pages, the cost of reaccessing them is a page fault plus a page
allocation. That is much lower than swapping in a page or refilling a file
page, because refilling an anon/file page incurs the same cost plus
extra I/O cost, which is very high.

The policy for deciding when to free lazyfree pages is controversial.
Some think lazyfree pages should be reclaimed before any other
anon/file pages, because userspace has already indicated that the pages
are not important and the cost of refilling lazyfree pages is much lower
than that of refilling anon/file pages. Others think userspace may still
use the MADV_FREE pages, since otherwise it would have used
MADV_DONTNEED to free them directly. If the page cache is not used again,
there is no refill cost for it, and in that case reclaiming MADV_FREE
pages first makes no sense because refilling MADV_FREE pages still has a
cost.

This patch does not take the latter view. It is possible that released
page cache never gets refilled, but the opposite is also quite likely.
Since the refill cost of file/anon pages is much higher than that of
MADV_FREE pages, it makes no sense to retain lazyfree pages.

The implementation targets swapless systems, so we do not allocate a swap
entry for lazyfree pages. If the pages cannot be reclaimed directly, they
are put back on the anon LRU list and reclaimed in the normal way.

Cc: Michal Hocko 
Cc: Minchan Kim 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Rik van Riel 
Cc: Mel Gorman 
Signed-off-by: Shaohua Li 
---
 mm/rmap.c   |  7 ++-
 mm/vmscan.c | 56 
 2 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index c48e9c1..f9b1023 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1546,13 +1546,18 @@ static int try_to_unmap_one(struct page *page, struct 
vm_area_struct *vma,
 * Store the swap location in the pte.
 * See handle_pte_fault() ...
 */
-   VM_BUG_ON_PAGE(!PageSwapCache(page), page);
+   VM_BUG_ON_PAGE(!PageSwapCache(page) && !PageLazyFree(page),
+   page);
 
if (!PageDirty(page) && (flags & TTU_LZFREE)) {
/* It's a freeable page by MADV_FREE */
dec_mm_counter(mm, MM_ANONPAGES);
rp->lazyfreed++;
goto discard;
+   } else if (flags & TTU_LZFREE) {
+   set_pte_at(mm, address, pte, pteval);
+   ret = SWAP_FAIL;
+   goto out_unmap;
}
 
if (swap_duplicate(entry) < 0) {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3a0d05b..f809f04 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -974,7 +974,7 @@ static unsigned long shrink_page_list(struct list_head 
*page_list,
int may_enter_fs;
enum page_references references = PAGEREF_RECLAIM_CLEAN;
bool dirty, writeback;
-   bool lazyfree = false;
+   bool lazyfree;
int ret = SWAP_SUCCESS;
 
cond_resched();
@@ -989,6 +989,8 @@ static unsigned long shrink_page_list(struct list_head 
*page_list,
 
sc->nr_scanned++;
 
+   lazyfree = page_is_lazyfree(page);
+
if (unlikely(!page_evictable(page)))
goto cull_mlocked;
 
@@ -996,7 +998,7 @@ static unsigned long shrink_page_list(struct list_head 
*page_list,
goto keep_locked;
 
/* Double the slab pressure for mapped and swapcache pages */
-   if (page_mapped(page) || PageSwapCache(page))
+   if ((page_mapped(page) || PageSwapCache(page)) && !lazyfree)
sc->nr_scanned++;
 
may_enter_fs = (sc->gfp_mask & __GFP_FS) ||
@@ -1110,6 +1112,14 @@ static unsigned long shrink_page_list(struct list_head 
*page_list,
; /* try to reclaim the page below */
}
 
+   /* lazyfree page could be freed directly */
+   if (lazyfree) {
+   if (unlikely(PageTransHuge(page)) &&
+   split_huge_page_to_list(page, page_list))
+   goto keep_locked;
+   goto unmap_page;
+   }
+
/*
 * Anonymous process memory has backing store?
 * Try to allocate it some swap space here.
@@ -1119,7 +1129,6 @@ static unsigned long shrink_page_list(struct list_head 
*page_list,
goto keep_locked;
if (!add_to_swap(page, page_list))
 

[RFC 2/6] mm: add lazyfree page flag

2017-01-29 Thread Shaohua Li
We are going to add MADV_FREE pages into a new LRU list. Add a new flag
to indicate such pages. Note, we are reusing PG_mappedtodisk for the new
flag. This is ok because no anonymous pages have this flag set.
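
The flag reuse can be illustrated with a small userspace model. This is
only a sketch: the bit positions, struct model_page, and the set_flag() /
test_flag() helpers are made up for illustration, and only the aliasing and
the two-bit check follow the patch below.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model; bit positions are illustrative, not the kernel's. */
enum model_pageflags {
	PG_swapbacked,
	PG_mappedtodisk,
	/* MADV_FREE: alias of a bit that anonymous pages never set */
	PG_lazyfree = PG_mappedtodisk,
};

struct model_page {
	unsigned long flags;
};

static bool test_flag(const struct model_page *page, int bit)
{
	return (page->flags & (1UL << bit)) != 0;
}

static void set_flag(struct model_page *page, int bit)
{
	assert(bit >= 0 && bit < (int)(sizeof(unsigned long) * 8));
	page->flags |= 1UL << bit;
}

/*
 * Follows page_is_lazyfree() in the patch: the aliased bit only means
 * "lazyfree" when the page is also swap backed (i.e. anonymous).
 */
static bool model_page_is_lazyfree(const struct model_page *page)
{
	return test_flag(page, PG_swapbacked) && test_flag(page, PG_lazyfree);
}
```

A file page with PG_mappedtodisk set is not swap backed in this model, so it
is never mistaken for a lazyfree page.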

The patch is based on Minchan's previous patch.

Cc: Michal Hocko 
Cc: Minchan Kim 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Rik van Riel 
Cc: Mel Gorman 
Signed-off-by: Shaohua Li
---
 fs/proc/task_mmu.c | 8 +++-
 include/linux/mm_inline.h  | 5 +
 include/linux/page-flags.h | 6 ++
 mm/huge_memory.c   | 1 +
 mm/migrate.c   | 2 ++
 5 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index ee3efb2..813d3aa 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -440,6 +440,7 @@ struct mem_size_stats {
unsigned long private_dirty;
unsigned long referenced;
unsigned long anonymous;
+   unsigned long lazyfree;
unsigned long anonymous_thp;
unsigned long shmem_thp;
unsigned long swap;
@@ -456,8 +457,11 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
int i, nr = compound ? 1 << compound_order(page) : 1;
unsigned long size = nr * PAGE_SIZE;
 
-   if (PageAnon(page))
+   if (PageAnon(page)) {
mss->anonymous += size;
+   if (PageLazyFree(page))
+   mss->lazyfree += size;
+   }
 
mss->resident += size;
/* Accumulate the size in pages that have been accessed. */
@@ -770,6 +774,7 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
   "Private_Dirty:  %8lu kB\n"
   "Referenced: %8lu kB\n"
   "Anonymous:  %8lu kB\n"
+  "LazyFree:   %8lu kB\n"
   "AnonHugePages:  %8lu kB\n"
   "ShmemPmdMapped: %8lu kB\n"
   "Shared_Hugetlb: %8lu kB\n"
@@ -788,6 +793,7 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
   mss.private_dirty >> 10,
   mss.referenced >> 10,
   mss.anonymous >> 10,
+  mss.lazyfree >> 10,
   mss.anonymous_thp >> 10,
   mss.shmem_thp >> 10,
   mss.shared_hugetlb >> 10,
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 0dddc2c..828e813 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -22,6 +22,11 @@ static inline int page_is_file_cache(struct page *page)
return !PageSwapBacked(page);
 }
 
+static inline bool page_is_lazyfree(struct page *page)
+{
+   return PageSwapBacked(page) && PageLazyFree(page);
+}
+
 static __always_inline void __update_lru_size(struct lruvec *lruvec,
enum lru_list lru, enum zone_type zid,
int nr_pages)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 6b5818d..e8ea378 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -107,6 +107,9 @@ enum pageflags {
 #endif
__NR_PAGEFLAGS,
 
+   /* MADV_FREE */
+   PG_lazyfree = PG_mappedtodisk,
+
/* Filesystems */
PG_checked = PG_owner_priv_1,
 
@@ -428,6 +431,9 @@ TESTPAGEFLAG_FALSE(Ksm)
 
 u64 stable_page_flags(struct page *page);
 
+PAGEFLAG(LazyFree, lazyfree, PF_ANY)
+   __CLEARPAGEFLAG(LazyFree, lazyfree, PF_ANY)
+
 static inline int PageUptodate(struct page *page)
 {
int ret;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 40bd376..ffa7ed5 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1918,6 +1918,7 @@ static void __split_huge_page_tail(struct page *head, int tail,
 (1L << PG_swapbacked) |
 (1L << PG_mlocked) |
 (1L << PG_uptodate) |
+(1L << PG_lazyfree) |
 (1L << PG_active) |
 (1L << PG_locked) |
 (1L << PG_unevictable) |
diff --git a/mm/migrate.c b/mm/migrate.c
index 502ebea..496105c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -641,6 +641,8 @@ void migrate_page_copy(struct page *newpage, struct page *page)
SetPageChecked(newpage);
if (PageMappedToDisk(page))
SetPageMappedToDisk(newpage);
+   if (PageLazyFree(page))
+   SetPageLazyFree(newpage);
 
/* Move dirty on pages not done by migrate_page_move_mapping() */
if (PageDirty(page))
-- 
2.9.3



[RFC 5/6] mm: reclaim lazyfree pages

2017-01-29 Thread Shaohua Li
When memory pressure is high, we must free lazyfree pages. If we free
lazyfree pages, the cost of reaccessing them is a page fault plus a page
allocation. That is much lower than swapping in a page or refilling a
file page cache page, because refilling an anon/file page has the same
cost plus extra IO cost, which is very high.

The policy for deciding when to free lazyfree pages is controversial.
Some think lazyfree pages should be reclaimed before any other anon/file
pages, because userspace has already indicated that the pages are not
important and the cost of refilling lazyfree pages is much lower than
refilling anon/file page cache. Others think userspace could still use
the MADV_FREE pages, because otherwise it would have used MADV_DONTNEED
to free the pages directly. If page cache won't be used again, it has no
refill cost, and in that case reclaiming MADV_FREE pages doesn't make
sense because refilling MADV_FREE pages still has a cost.

This patch takes the former approach and reclaims lazyfree pages first.
It's possible that released page cache never gets refilled, but the
opposite case is very likely too. Since the refill cost of file/anon
pages is much higher than the refill cost of MADV_FREE pages, it doesn't
make sense to retain lazyfree pages.
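
The policy argued for above can be sketched as a simple ordering rule. This
is a toy model, not the vmscan code: struct sketch_lru and
pick_reclaim_list() are hypothetical, and trying file pages before anon
pages once the lazyfree list is empty is a simplifying assumption (the real
reclaim code balances the two).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list identifiers for this sketch. */
enum sketch_list { SKETCH_LAZYFREE, SKETCH_FILE, SKETCH_ANON, SKETCH_NONE };

struct sketch_lru {
	size_t nr_lazyfree;
	size_t nr_file;
	size_t nr_anon;
	int swap_available;	/* anon pages need swap to be reclaimable */
};

/* Always drain lazyfree pages first: they are the cheapest to refill. */
static enum sketch_list pick_reclaim_list(const struct sketch_lru *lru)
{
	assert(lru != NULL);
	if (lru->nr_lazyfree)
		return SKETCH_LAZYFREE;
	if (lru->nr_file)
		return SKETCH_FILE;
	if (lru->nr_anon && lru->swap_available)
		return SKETCH_ANON;
	return SKETCH_NONE;
}
```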

The implementation targets swapless systems, so we don't allocate a swap
entry for lazyfree pages. If the pages can't be reclaimed directly, they
are put back on the anon LRU list and reclaimed in the normal way.
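
The branch added to try_to_unmap_one() in the diff below boils down to a
two-input decision. Here is a standalone sketch of it; the enum and
function names are hypothetical, and the real code uses goto labels and
restores the pte with set_pte_at() rather than returning a value like this.

```c
#include <stdbool.h>

/* Hypothetical outcome names for this sketch. */
enum sketch_unmap_result {
	SKETCH_DISCARD,		/* clean lazyfree page: drop it, no swap entry */
	SKETCH_PUT_BACK,	/* dirtied after MADV_FREE: restore pte and fail */
	SKETCH_SWAP_OUT,	/* not a lazyfree unmap: normal swap path */
};

static enum sketch_unmap_result
sketch_unmap_decision(bool ttu_lzfree, bool page_dirty)
{
	if (ttu_lzfree && !page_dirty)
		return SKETCH_DISCARD;
	if (ttu_lzfree)
		return SKETCH_PUT_BACK;
	return SKETCH_SWAP_OUT;
}
```

A page that was written to after madvise() is dirty again, so it takes the
SKETCH_PUT_BACK path instead of being discarded.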

Cc: Michal Hocko 
Cc: Minchan Kim 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Rik van Riel 
Cc: Mel Gorman 
Signed-off-by: Shaohua Li 
---
 mm/rmap.c   |  7 ++-
 mm/vmscan.c | 56 
 2 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index c48e9c1..f9b1023 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1546,13 +1546,18 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 * Store the swap location in the pte.
 * See handle_pte_fault() ...
 */
-   VM_BUG_ON_PAGE(!PageSwapCache(page), page);
+   VM_BUG_ON_PAGE(!PageSwapCache(page) && !PageLazyFree(page),
+   page);
 
if (!PageDirty(page) && (flags & TTU_LZFREE)) {
/* It's a freeable page by MADV_FREE */
dec_mm_counter(mm, MM_ANONPAGES);
rp->lazyfreed++;
goto discard;
+   } else if (flags & TTU_LZFREE) {
+   set_pte_at(mm, address, pte, pteval);
+   ret = SWAP_FAIL;
+   goto out_unmap;
}
 
if (swap_duplicate(entry) < 0) {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3a0d05b..f809f04 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -974,7 +974,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
int may_enter_fs;
enum page_references references = PAGEREF_RECLAIM_CLEAN;
bool dirty, writeback;
-   bool lazyfree = false;
+   bool lazyfree;
int ret = SWAP_SUCCESS;
 
cond_resched();
@@ -989,6 +989,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 
sc->nr_scanned++;
 
+   lazyfree = page_is_lazyfree(page);
+
if (unlikely(!page_evictable(page)))
goto cull_mlocked;
 
@@ -996,7 +998,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
goto keep_locked;
 
/* Double the slab pressure for mapped and swapcache pages */
-   if (page_mapped(page) || PageSwapCache(page))
+   if ((page_mapped(page) || PageSwapCache(page)) && !lazyfree)
sc->nr_scanned++;
 
may_enter_fs = (sc->gfp_mask & __GFP_FS) ||
@@ -1110,6 +1112,14 @@ static unsigned long shrink_page_list(struct list_head *page_list,
; /* try to reclaim the page below */
}
 
+   /* lazyfree page could be freed directly */
+   if (lazyfree) {
+   if (unlikely(PageTransHuge(page)) &&
+   split_huge_page_to_list(page, page_list))
+   goto keep_locked;
+   goto unmap_page;
+   }
+
/*
 * Anonymous process memory has backing store?
 * Try to allocate it some swap space here.
@@ -1119,7 +1129,6 @@ static unsigned long shrink_page_list(struct list_head *page_list,
goto keep_locked;
if (!add_to_swap(page, page_list))
goto activate_locked;
-   lazyfree = true;
may_enter_fs = 1;
 

[RFC 3/6] mm: add LRU_LAZYFREE lru list

2017-01-29 Thread Shaohua Li
MADV_FREE pages are currently on the anonymous LRU list, which causes
several problems:
- Systems without swap enabled are not supported. If swap is off, we
  can't age anonymous pages, or can't age them efficiently. And since
  MADV_FREE pages are mixed with other anonymous pages, we can't reclaim
  MADV_FREE pages.
- Memory pressure increases. Page reclaim is biased toward reclaiming
  file pages over anonymous pages. This doesn't make sense for MADV_FREE
  pages, because those pages can be freed easily with a very slight
  penalty. Even if page reclaim weren't biased toward file pages, there
  would still be an issue, because MADV_FREE pages and other anonymous
  pages are mixed together. To reclaim a MADV_FREE page, we probably must
  scan a lot of other anonymous pages, which is inefficient.

Introducing a new LRU list for MADV_FREE pages could solve the issues.
If only MADV_FREE pages are in the new list, page reclaim can easily
reclaim such pages without interference of file or anonymous pages.

This patch adds a LRU_LAZYFREE lru list. It's a dedicated LRU list for
MADV_FREE pages.
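
As a rough illustration of how a page picks its list after this change,
here is a userspace model of the page_lru_base_type() logic from the diff
below. The struct, the enum values and their ordering, and the counter
helper are assumptions made for the sketch; only the order of the two
checks is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative list ids; the position of the lazyfree list is assumed. */
enum sketch_lru_list {
	SKETCH_LRU_INACTIVE_ANON,
	SKETCH_LRU_ACTIVE_ANON,
	SKETCH_LRU_INACTIVE_FILE,
	SKETCH_LRU_ACTIVE_FILE,
	SKETCH_LRU_LAZYFREE,
	SKETCH_LRU_UNEVICTABLE,
	SKETCH_NR_LRU_LISTS,
};

struct sketch_page {
	bool swap_backed;	/* false means file cache */
	bool lazyfree;
};

/*
 * File cache is tested first. Since the lazyfree flag aliases
 * PG_mappedtodisk, a file page must never reach the lazyfree test.
 */
static enum sketch_lru_list sketch_lru_base_type(const struct sketch_page *page)
{
	assert(page != NULL);
	if (!page->swap_backed)
		return SKETCH_LRU_INACTIVE_FILE;
	if (page->lazyfree)
		return SKETCH_LRU_LAZYFREE;
	return SKETCH_LRU_INACTIVE_ANON;
}

/* Hypothetical per-list accounting, one counter per list. */
static void sketch_account_page(size_t counts[SKETCH_NR_LRU_LISTS],
				const struct sketch_page *page)
{
	counts[sketch_lru_base_type(page)]++;
}
```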

The patch is based on Minchan's previous patch.

Cc: Michal Hocko 
Cc: Minchan Kim 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Rik van Riel 
Cc: Mel Gorman 
Signed-off-by: Shaohua Li 
---
 drivers/base/node.c   |  2 ++
 drivers/staging/android/lowmemorykiller.c |  3 ++-
 fs/proc/meminfo.c |  1 +
 include/linux/mm_inline.h | 10 ++
 include/linux/mmzone.h|  9 +
 include/linux/vm_event_item.h |  2 +-
 include/trace/events/mmflags.h|  1 +
 include/trace/events/vmscan.h | 10 +++---
 kernel/power/snapshot.c   |  1 +
 mm/compaction.c   |  8 +---
 mm/memcontrol.c   |  4 
 mm/page_alloc.c   | 10 ++
 mm/vmscan.c   | 21 ++---
 mm/vmstat.c   |  4 
 14 files changed, 71 insertions(+), 15 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 5548f96..5c09b67 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -70,6 +70,7 @@ static ssize_t node_read_meminfo(struct device *dev,
   "Node %d Inactive(anon): %8lu kB\n"
   "Node %d Active(file):   %8lu kB\n"
   "Node %d Inactive(file): %8lu kB\n"
+  "Node %d LazyFree:   %8lu kB\n"
   "Node %d Unevictable:%8lu kB\n"
   "Node %d Mlocked:%8lu kB\n",
   nid, K(i.totalram),
@@ -83,6 +84,7 @@ static ssize_t node_read_meminfo(struct device *dev,
   nid, K(node_page_state(pgdat, NR_INACTIVE_ANON)),
   nid, K(node_page_state(pgdat, NR_ACTIVE_FILE)),
   nid, K(node_page_state(pgdat, NR_INACTIVE_FILE)),
+  nid, K(node_page_state(pgdat, NR_LAZYFREE)),
   nid, K(node_page_state(pgdat, NR_UNEVICTABLE)),
   nid, K(sum_zone_node_page_state(nid, NR_MLOCK)));
 
diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
index ec3b665..2648872 100644
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -75,7 +75,8 @@ static unsigned long lowmem_count(struct shrinker *s,
return global_node_page_state(NR_ACTIVE_ANON) +
global_node_page_state(NR_ACTIVE_FILE) +
global_node_page_state(NR_INACTIVE_ANON) +
-   global_node_page_state(NR_INACTIVE_FILE);
+   global_node_page_state(NR_INACTIVE_FILE) +
+   global_node_page_state(NR_LAZYFREE);
 }
 
 static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 8a42849..7803d33 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -79,6 +79,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
show_val_kb(m, "Inactive(anon): ", pages[LRU_INACTIVE_ANON]);
show_val_kb(m, "Active(file):   ", pages[LRU_ACTIVE_FILE]);
show_val_kb(m, "Inactive(file): ", pages[LRU_INACTIVE_FILE]);
+   show_val_kb(m, "LazyFree:   ", pages[LRU_LAZYFREE]);
show_val_kb(m, "Unevictable:", pages[LRU_UNEVICTABLE]);
show_val_kb(m, "Mlocked:", global_page_state(NR_MLOCK));
 
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 828e813..5f22c93 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -81,6 +81,8 @@ static inline enum lru_list page_lru_base_type(struct page *page)
 {
if (page_is_file_cache(page))
return LRU_INACTIVE_FILE;
+   if (PageLazyFree(page))
+   return LRU_LAZYFREE;
return LRU_INACTIVE_ANON;
 }
 
@@ -100,6 +102,8 @@ static 

[RFC 0/6] mm: add new LRU list for MADV_FREE pages

2017-01-29 Thread Shaohua Li
Hi,

We are trying to use MADV_FREE in jemalloc and have found several issues. Until
they are solved, jemalloc can't use the MADV_FREE feature.
- Systems without swap enabled are not supported. If swap is off, we can't age
  anonymous pages, or can't age them efficiently. And since MADV_FREE pages are
  mixed with other anonymous pages, we can't reclaim MADV_FREE pages. In the
  current implementation, MADV_FREE falls back to MADV_DONTNEED when swap is
  not enabled. But in our environment, a lot of machines don't enable swap,
  which prevents our setup from using MADV_FREE.
- Memory pressure increases. Page reclaim is biased toward reclaiming file
  pages over anonymous pages. This doesn't make sense for MADV_FREE pages,
  because those pages can be freed easily and refilled with a very slight
  penalty. Even if page reclaim weren't biased toward file pages, there would
  still be an issue, because MADV_FREE pages and other anonymous pages are
  mixed together. To reclaim a MADV_FREE page, we probably must scan a lot of
  other anonymous pages, which is inefficient. In our tests, we usually see OOM
  with MADV_FREE enabled and none without it.
- RSS accounting. MADV_FREE pages are accounted as normal anon pages and
  reclaimed lazily, so an application's RSS becomes bigger. This confuses our
  workloads. We have a monitoring daemon running, and if it finds that an
  application's RSS has become abnormal, it kills the application even though
  the kernel can reclaim the memory easily. Currently we don't export separate
  RSS accounting for MADV_FREE pages, which also prevents our setup from using
  MADV_FREE.

For the first two issues, introducing a new LRU list for MADV_FREE pages could
solve the issues. We can directly reclaim MADV_FREE pages without writing them
out to swap, so the first issue could be fixed. If only MADV_FREE pages are in
the new list, page reclaim can easily reclaim such pages without interference
from file or anonymous pages. The memory pressure issue will disappear.
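
To make the scanning argument concrete, here is a toy model that counts how
many list entries must be examined to find a given number of MADV_FREE
pages. Both functions are hypothetical and ignore everything else reclaim
does (aging, references, locking); they only compare a mixed list with a
dedicated one.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Mixed list: walk entries in order until `want` lazyfree pages are found
 * or the list ends. Returns the number of entries scanned.
 */
static size_t scan_mixed(const bool *is_lazyfree, size_t n, size_t want)
{
	size_t scanned = 0, found = 0;

	while (scanned < n && found < want) {
		if (is_lazyfree[scanned])
			found++;
		scanned++;
	}
	return scanned;
}

/* Dedicated list: every entry is a candidate, so scanned == reclaimed. */
static size_t scan_dedicated(size_t nr_lazyfree, size_t want)
{
	return want < nr_lazyfree ? want : nr_lazyfree;
}
```

With three MADV_FREE pages spread over an eight-entry mixed list, finding
two of them can mean scanning most of the list, while the dedicated list
needs exactly two steps.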

Actually Minchan posted patches to add the LRU list before, but he didn't
pursue them. So I picked them up, and these patches are based on Minchan's
previous patches. The main difference between my patches and Minchan's is the
page reclaim policy. Minchan's patches introduce a knob to balance the reclaim
of MADV_FREE pages and anon/file pages, while these patches always reclaim
MADV_FREE pages first if there are any. I described the reason in patch 5.

For the third issue, we can add a separate RSS count for MADV_FREE pages. The
count will be increased in the madvise syscall and decreased in page reclaim
(e.g., unmap). One issue is activate_page(): a MADV_FREE page can be promoted
to an active page there, but there is no mm_struct context at that place, and
iterating over VMAs there sounds too silly. The patchset doesn't fix this issue
yet. Hopefully somebody can share a hint on how to fix it.
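
Until a per-mm counter exists, a monitoring daemon could approximate the
number from the "LazyFree:" lines that patch 2 adds to the smaps output. A
minimal sketch, assuming the text has already been read into a buffer (for
example from /proc/<pid>/smaps); sum_lazyfree_kb() is a hypothetical helper
and not part of this series.

```c
#include <stdio.h>
#include <string.h>

/* Sum every "LazyFree: <n> kB" field found in smaps-formatted text. */
static unsigned long sum_lazyfree_kb(const char *smaps_text)
{
	unsigned long total = 0, kb;
	const char *line = smaps_text;

	while (line && *line) {
		/* Only lines that start with the field name match. */
		if (sscanf(line, "LazyFree: %lu kB", &kb) == 1)
			total += kb;
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return total;
}
```

The daemon could then subtract this value from the application's RSS before
deciding that it has grown abnormally.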

Thanks,
Shaohua

Minchan's previous patches:
http://marc.info/?l=linux-mm&m=144800657002763&w=2

Shaohua Li (6):
  mm: add wrap for page accouting index
  mm: add lazyfree page flag
  mm: add LRU_LAZYFREE lru list
  mm: move MADV_FREE pages into LRU_LAZYFREE list
  mm: reclaim lazyfree pages
  mm: enable MADV_FREE for swapless system

 drivers/base/node.c   |  2 +
 drivers/staging/android/lowmemorykiller.c |  3 +-
 fs/proc/meminfo.c |  1 +
 fs/proc/task_mmu.c|  8 ++-
 include/linux/mm_inline.h | 41 +
 include/linux/mmzone.h|  9 +++
 include/linux/page-flags.h|  6 ++
 include/linux/swap.h  |  2 +-
 include/linux/vm_event_item.h |  2 +-
 include/trace/events/mmflags.h|  1 +
 include/trace/events/vmscan.h | 31 +-
 kernel/power/snapshot.c   |  1 +
 mm/compaction.c   | 11 ++--
 mm/huge_memory.c  |  6 +-
 mm/khugepaged.c   |  6 +-
 mm/madvise.c  | 11 +---
 mm/memcontrol.c   |  4 ++
 mm/memory-failure.c   |  3 +-
 mm/memory_hotplug.c   |  3 +-
 mm/mempolicy.c|  3 +-
 mm/migrate.c  | 29 --
 mm/page_alloc.c   | 10 
 mm/rmap.c |  7 ++-
 mm/swap.c | 51 +---
 mm/vmscan.c   | 96 +++
 mm/vmstat.c   |  4 ++
 26 files changed, 242 insertions(+), 109 deletions(-)

-- 
2.9.3



Re: [GIT PULL] STi DT fix for v4.10-rcs???

2017-01-29 Thread Olof Johansson
On Fri, Jan 27, 2017 at 04:10:10PM +, Patrice CHOTARD wrote:
> Hi Arnd, Olof, Kevin
> 
> Please consider this set for inclusion into the next v4.10-rc.
> 
> The following changes since commit 7ce7d89f48834cefece7804d38fc5d85382edf77:
> 
>Linux 4.10-rc1 (2016-12-25 16:13:08 -0800)
> 
> are available in the git repository at:
> 
>git://git.kernel.org/pub/scm/linux/kernel/git/pchotard/sti.git tags/sti-dt-for-v4.10-rc
> 
> for you to fetch changes up to 8413299cb3933dade6186bbee8363f190032107e:
> 
>ARM: dts: STiH407-family: set snps,dis_u3_susphy_quirk (2017-01-27 
> 16:17:13 +0100)
> 
> 
> 
> STi DT fix:
> 
> Since v4.10-rc1, xhci is complaining in loop with :
> [ 801.953836] usb usb6-port1: Cannot enable. Maybe the USB cable is bad?
> [ 801.960455] xhci-hcd xhci-hcd.0.auto: Cannot set link state.
> [ 801.966611] usb usb6-port1: cannot disable (err = -32)
> 
> Setting the property "snps,dis_u3_susphy_quirk" in DT fixes it.
> 
> 
> Patrice Chotard (1):
>   ARM: dts: STiH407-family: set snps,dis_u3_susphy_quirk

Merged, thanks.


-Olof



Re: [GIT PULL 2/4] soc: samsung: exynos for v4.11, 2nd round

2017-01-29 Thread Olof Johansson
On Sun, Jan 29, 2017 at 10:06:27PM +0200, Krzysztof Kozlowski wrote:
> Hi,
> 
> On top of previous pull request (tags/samsung-drivers-soc-pmu-4.11).
> 
> This adds support for Exynos5433 to PMU driver which is needed
> by Marek's patchset:
>  - [PATCH v2 0/8] Pad retentions support for Exynos5433
>    https://lkml.kernel.org/r/1485419634-28331-1-git-send-email-m.szyprowski@samsung.com
> 
> 
> Cc: Marek Szyprowski 
> Cc: Sylwester Nawrocki 
> Cc: Linus Walleij 
> Cc: Tomasz Figa 
> Cc: Lee Jones 
> 
> Best regards,
> Krzysztof

Merged, thanks!


-Olof



Re: [GIT PULL] ARM: at91: drivers for 4.11 #2

2017-01-29 Thread Olof Johansson
On Sat, Jan 28, 2017 at 01:45:10AM +0100, Alexandre Belloni wrote:
> Arnd, Olof
> 
> The EBI was requiring more fixing than expected in preparation of the
> NAND driver rework for 4.12.
> 
> The following changes since commit ee194289502a6901cc77dc9a893bf2afd351ac5e:
> 
>   memory/atmel-ebi: Fix ns <-> cycles conversions (2017-01-10 16:01:34 +0100)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux.git tags/at91-ab-4.11-drivers2
> 
> for you to fetch changes up to 87108dc78eb8935b5cebab70f8158807d5a7617f:
> 
>   memory: atmel-ebi: Enable the SMC clock if specified (2017-01-27 10:28:54 +0100)

Merged, thanks!


-Olof



Re: [RESEND PATCH 0/2] arm64: defconfig: enable some NAND config symbols

2017-01-29 Thread Olof Johansson
On Mon, Jan 30, 2017 at 10:55:45AM +0900, Masahiro Yamada wrote:
> I am resending this series to a...@kernel.org
> as requested by Olof Johansson.
> 
> No code change since the previous post.
> 
> 
> 
> Masahiro Yamada (2):
>   arm64: defconfig: enable CONFIG_MTD_BLOCK
>   arm64: defconfig: enable CONFIG_MTD_NAND and CONFIG_MTD_NAND_DENALI_DT
> 
>  arch/arm64/configs/defconfig | 3 +++
>  1 file changed, 3 insertions(+)

Applied, thanks!


-Olof



Re: [GIT PULL 1/4] ARM: mach/soc: exynos for v4.11, 2nd round

2017-01-29 Thread Olof Johansson
On Sun, Jan 29, 2017 at 10:06:30PM +0200, Krzysztof Kozlowski wrote:
> 
> On top of previous pull request.
> 
> 
> The following changes since commit cda1a52dab50340728e46601e6c9da9fc4beaf1f:
> 
>   ARM: s3c64xx: Constify wake_irqs (2016-12-29 15:41:44 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git tags/samsung-soc-4.11-2
> 
> for you to fetch changes up to 95ed41df129ca8f76eb190534d20f4a6dcd37213:
> 
>   dt-bindings: video: exynos7-decon: Remove obsolete samsung,power-domain property (2017-01-29 21:03:56 +0200)
> 
> 
> Samsung mach/soc update for v4.11, second round:
> 1. Remove mach code for Exynos4415 as a continuation of removal
>    of this SoC.
> 2. Remove obsolete property from the bindings documentation.

Merged, thanks!


-Olof



Re: [GIT PULL 4/4] arm64: dts: exynos: for v4.11, 2nd round

2017-01-29 Thread Olof Johansson
Hi Krzysztof,

On Sun, Jan 29, 2017 at 10:06:29PM +0200, Krzysztof Kozlowski wrote:
> Hi,
> 
> On top of previous pull request.
> 
> This adds proper clocks to LPASS node on Exynos5433 which is needed
> by Marek's patchset:
>  - [PATCH v2 0/8] Pad retentions support for Exynos5433
>    https://lkml.kernel.org/r/1485419634-28331-1-git-send-email-m.szyprowski@samsung.com
> 
> 
> Cc: Marek Szyprowski 
> Cc: Sylwester Nawrocki 
> Cc: Linus Walleij 
> Cc: Tomasz Figa 
> Cc: Lee Jones 
> 
> Best regards,
> Krzysztof
> 
> 
> The following changes since commit e4e381133241a27d732e78be09973b89a193eaf7:
> 
>   arm64: dts: exynos: Enable HDMI/TV path on Exynos5433-TM2 (2017-01-11 18:20:28 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git tags/samsung-dt64-4.11-2
> 
> for you to fetch changes up to 7547162ac351483df3641f64e99e10be329dd6a2:
> 
>   arm64: dts: exynos: Add clocks to Exynos5433 LPASS module (2017-01-26 22:04:20 +0200)

I think you tagged the wrong branch here. The log message shows the right hash
at the tip, but the tag is of 95648b747071d530b5bb983735cfe01b66bf, which
seems to be on your for-next.

Care to respin, so your tag and our merged branch match up?


Thanks!


-Olof



Re: [PATCH 0/2] xen/net: limit number of tx/rx queues

2017-01-29 Thread Boris Ostrovsky



On 01/10/2017 08:32 AM, Juergen Gross wrote:
> The Xen network frontend/backend supports multiple tx/rx queues for one
> virtual interface. The number of queues supported by the backend is
> set to the number of cpus of the backend driver domain (usually dom0)
> and the number of queues requested by the frontend is limited by the
> number of vcpus of the related guest.
>
> On large systems this can lead to a ridiculously large number of queues,
> exhausting the required number of grant pages rather quickly.
>
> To avoid this, limit the default maximum on both sides to 8. Both
> frontend and backend maximum can be individually tuned via module
> parameters.
>
> Juergen Gross (2):
>   xen/netfront: set default upper limit of tx/rx queues to 8
>   xen/netback: set default upper limit of tx/rx queues to 8
>
>  drivers/net/xen-netback/netback.c | 6 --
>  drivers/net/xen-netfront.c        | 6 --
>  2 files changed, 8 insertions(+), 4 deletions(-)
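
For readers skimming the archive, the default described in the cover letter
can be sketched roughly as below. This is only an illustration of the stated
rule (one queue per CPU, capped at 8 unless a module parameter overrides it);
the names DEFAULT_MAX_QUEUES and effective_max_queues() are hypothetical and
are not taken from the patches themselves.

```c
/* Hypothetical cap; the value 8 is the default named in the cover letter. */
#define DEFAULT_MAX_QUEUES 8u

/*
 * Return the upper limit of tx/rx queues for one side (frontend or backend).
 *
 * ncpus: number of (v)cpus available to that side.
 * param: value of a hypothetical "max queues" module parameter;
 *        0 is assumed here to mean "not set by the administrator".
 */
unsigned int effective_max_queues(unsigned int ncpus, unsigned int param)
{
	/* An explicit tuning value takes precedence over the default. */
	if (param != 0)
		return param;

	/* Default: one queue per CPU, but never more than the cap. */
	return ncpus < DEFAULT_MAX_QUEUES ? ncpus : DEFAULT_MAX_QUEUES;
}
```

With this rule a 64-cpu dom0 would default to 8 queues instead of 64, while a
4-vcpu guest would still get 4.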



Applied to for-linus-4.11

-boris

