date:20170828

Re: [PATCH v4 3/3] ARM: dts: exynos: Remove the display-timing and delay from rinato dts

2017-08-28 Thread Krzysztof Kozlowski

On Tue, Aug 29, 2017 at 4:52 AM, Hoegeun Kwon  wrote:
> Hi Krzysztof,
>
> The driver has been merged into exynos-drm-misc.
> Could you please check this patch (3/3)?

Hi, OK, no problem for me, but it is too late for the current cycle, so it
will go in for v4.15.

Best regards,
Krzysztof

>
> Best regards,
> Hoegeun
>
>
> On 07/13/2017 11:20 AM, Hoegeun Kwon wrote:
>>
>> The display-timing and delay are included in the panel driver. So it
>> should be removed in dts.
>>
>> Signed-off-by: Hoegeun Kwon 
>> ---
>>   arch/arm/boot/dts/exynos3250-rinato.dts | 22 --
>>   1 file changed, 22 deletions(-)
>>
>> diff --git a/arch/arm/boot/dts/exynos3250-rinato.dts b/arch/arm/boot/dts/exynos3250-rinato.dts
>> index 443e0c9..6b70c8d 100644
>> --- a/arch/arm/boot/dts/exynos3250-rinato.dts
>> +++ b/arch/arm/boot/dts/exynos3250-rinato.dts
>> @@ -242,28 +242,6 @@
>> vci-supply = <_reg>;
>> reset-gpios = < 1 GPIO_ACTIVE_LOW>;
>> te-gpios = < 6 GPIO_ACTIVE_HIGH>;
>> -   power-on-delay= <30>;
>> -   power-off-delay= <120>;
>> -   reset-delay = <5>;
>> -   init-delay = <100>;
>> -   flip-horizontal;
>> -   flip-vertical;
>> -   panel-width-mm = <29>;
>> -   panel-height-mm = <29>;
>> -
>> -   display-timings {
>> -   timing-0 {
>> -   clock-frequency = <460>;
>> -   hactive = <320>;
>> -   vactive = <320>;
>> -   hfront-porch = <1>;
>> -   hback-porch = <1>;
>> -   hsync-len = <1>;
>> -   vfront-porch = <150>;
>> -   vback-porch = <1>;
>> -   vsync-len = <2>;
>> -   };
>> -   };
>> port {
>> dsi_in: endpoint {
>
>
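
For reference, the timing values removed from the dts above fix the panel's total line and frame lengths. The following is a stand-alone sketch, not the panel driver's code: `struct timing`, `htotal()`, `vtotal()` and `refresh_hz()` are hypothetical names chosen for illustration, and the pixel clock is left as a caller-supplied parameter.

```c
#include <stdint.h>

/* Hypothetical container for the timing fields removed from the dts. */
struct timing {
	uint32_t hactive, hfront_porch, hback_porch, hsync_len;
	uint32_t vactive, vfront_porch, vback_porch, vsync_len;
};

/* Values taken from the deleted timing-0 node. */
static const struct timing rinato_timing = {
	.hactive = 320, .hfront_porch = 1, .hback_porch = 1, .hsync_len = 1,
	.vactive = 320, .vfront_porch = 150, .vback_porch = 1, .vsync_len = 2,
};

/* Pixel clocks per line: active area plus horizontal blanking. */
static uint32_t htotal(const struct timing *t)
{
	return t->hactive + t->hfront_porch + t->hback_porch + t->hsync_len;
}

/* Lines per frame: active area plus vertical blanking. */
static uint32_t vtotal(const struct timing *t)
{
	return t->vactive + t->vfront_porch + t->vback_porch + t->vsync_len;
}

/* Refresh rate in Hz for a given pixel clock in Hz (integer division). */
static uint32_t refresh_hz(const struct timing *t, uint32_t pixel_clock_hz)
{
	return pixel_clock_hz / (htotal(t) * vtotal(t));
}
```

With these values a line is 323 pixel clocks long and a frame is 473 lines, so one frame takes 152779 pixel clocks.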

[PATCH 4/4] [media] zr364xx: Fix a typo in a comment line of the file header

2017-08-28 Thread SF Markus Elfring

From: Markus Elfring 
Date: Mon, 28 Aug 2017 22:46:30 +0200

Fix a word in this description.

Signed-off-by: Markus Elfring 
---
 drivers/media/usb/zr364xx/zr364xx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/usb/zr364xx/zr364xx.c b/drivers/media/usb/zr364xx/zr364xx.c
index 4cc6d2a9d91f..4ccf71d8b608 100644
--- a/drivers/media/usb/zr364xx/zr364xx.c
+++ b/drivers/media/usb/zr364xx/zr364xx.c
@@ -2,7 +2,7 @@
  * Zoran 364xx based USB webcam module version 0.73
  *
  * Allows you to use your USB webcam with V4L2 applications
- * This is still in heavy developpement !
+ * This is still in heavy development!
  *
  * Copyright (C) 2004  Antoine Jacquet 
  * http://royale.zerezo.com/zr364xx/
-- 
2.14.1

[PATCH 3/4] [media] zr364xx: Adjust ten checks for null pointers

2017-08-28 Thread SF Markus Elfring

From: Markus Elfring 
Date: Mon, 28 Aug 2017 22:40:47 +0200

The script “checkpatch.pl” reported information like the following.

Comparison to NULL could be written !…

Thus fix the affected source code places.

Signed-off-by: Markus Elfring 
---
 drivers/media/usb/zr364xx/zr364xx.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/media/usb/zr364xx/zr364xx.c b/drivers/media/usb/zr364xx/zr364xx.c
index 37cd6e20e68a..4cc6d2a9d91f 100644
--- a/drivers/media/usb/zr364xx/zr364xx.c
+++ b/drivers/media/usb/zr364xx/zr364xx.c
@@ -385,9 +385,9 @@ static int buffer_prepare(struct videobuf_queue *vq, struct videobuf_buffer *vb,
  vb);
int rc;
 
-   DBG("%s, field=%d, fmt name = %s\n", __func__, field, cam->fmt != NULL ?
-   cam->fmt->name : "");
-   if (cam->fmt == NULL)
+   DBG("%s, field=%d, fmt name = %s\n", __func__, field,
+   cam->fmt ? cam->fmt->name : "");
+   if (!cam->fmt)
return -EINVAL;
 
buf->vb.size = cam->width * cam->height * (cam->fmt->depth >> 3);
@@ -787,7 +787,7 @@ static int zr364xx_vidioc_try_fmt_vid_cap(struct file *file, void *priv,
struct zr364xx_camera *cam = video_drvdata(file);
char pixelformat_name[5];
 
-   if (cam == NULL)
+   if (!cam)
return -ENODEV;
 
if (f->fmt.pix.pixelformat != V4L2_PIX_FMT_JPEG) {
@@ -817,7 +817,7 @@ static int zr364xx_vidioc_g_fmt_vid_cap(struct file *file, void *priv,
 {
struct zr364xx_camera *cam;
 
-   if (file == NULL)
+   if (!file)
return -ENODEV;
cam = video_drvdata(file);
 
@@ -979,13 +979,13 @@ static void read_pipe_completion(struct urb *purb)
 
pipe_info = purb->context;
_DBG("%s %p, status %d\n", __func__, purb, purb->status);
-   if (pipe_info == NULL) {
+   if (!pipe_info) {
printk(KERN_ERR KBUILD_MODNAME ": no context!\n");
return;
}
 
cam = pipe_info->cam;
-   if (cam == NULL) {
+   if (!cam) {
printk(KERN_ERR KBUILD_MODNAME ": no context!\n");
return;
}
@@ -1069,7 +1069,7 @@ static void zr364xx_stop_readpipe(struct zr364xx_camera *cam)
 {
struct zr364xx_pipeinfo *pipe_info;
 
-   if (cam == NULL) {
+   if (!cam) {
printk(KERN_ERR KBUILD_MODNAME ": invalid device\n");
return;
}
@@ -1273,7 +1273,7 @@ static int zr364xx_mmap(struct file *file, struct vm_area_struct *vma)
struct zr364xx_camera *cam = video_drvdata(file);
int ret;
 
-   if (cam == NULL) {
+   if (!cam) {
DBG("%s: cam == NULL\n", __func__);
return -ENODEV;
}
@@ -1357,7 +1357,7 @@ static int zr364xx_board_init(struct zr364xx_camera *cam)
 
pipe->transfer_buffer = kzalloc(pipe->transfer_size,
GFP_KERNEL);
-   if (pipe->transfer_buffer == NULL) {
+   if (!pipe->transfer_buffer) {
DBG("out of memory!\n");
return -ENOMEM;
}
@@ -1373,7 +1373,7 @@ static int zr364xx_board_init(struct zr364xx_camera *cam)
DBG("valloc %p, idx %lu, pdata %p\n",
&cam->buffer.frame[i], i,
cam->buffer.frame[i].lpvbits);
-   if (cam->buffer.frame[i].lpvbits == NULL) {
+   if (!cam->buffer.frame[i].lpvbits) {
printk(KERN_INFO KBUILD_MODNAME ": out of memory. Using less frames\n");
break;
}
-- 
2.14.1
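
The two spellings touched by this patch are equivalent in C: `!p` is true exactly when `p == NULL`. A minimal user-space sketch of that equivalence follows; `struct camera` and both accessor functions are hypothetical stand-ins rather than driver code, and the errno constant stands in for the kernel's `-ENODEV`.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver's per-device structure. */
struct camera {
	int width;
};

/* Older spelling: explicit comparison against NULL. */
static int camera_width_explicit(const struct camera *cam)
{
	if (cam == NULL)
		return -ENODEV;
	return cam->width;
}

/* Spelling suggested by checkpatch.pl: logical negation of the pointer. */
static int camera_width(const struct camera *cam)
{
	if (!cam)
		return -ENODEV;
	return cam->width;
}
```

Both functions behave identically for every input; the change is purely stylistic.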

[PATCH 2/4] [media] zr364xx: Improve a size determination in zr364xx_probe()

2017-08-28 Thread SF Markus Elfring

From: Markus Elfring 
Date: Mon, 28 Aug 2017 22:28:02 +0200

Replace the specification of a data structure by a pointer dereference
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/media/usb/zr364xx/zr364xx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/usb/zr364xx/zr364xx.c b/drivers/media/usb/zr364xx/zr364xx.c
index 97af697dcc81..37cd6e20e68a 100644
--- a/drivers/media/usb/zr364xx/zr364xx.c
+++ b/drivers/media/usb/zr364xx/zr364xx.c
@@ -1421,7 +1421,7 @@ static int zr364xx_probe(struct usb_interface *intf,
 le16_to_cpu(udev->descriptor.idVendor),
 le16_to_cpu(udev->descriptor.idProduct));
 
-   cam = kzalloc(sizeof(struct zr364xx_camera), GFP_KERNEL);
+   cam = kzalloc(sizeof(*cam), GFP_KERNEL);
if (!cam)
return -ENOMEM;
 
-- 
2.14.1
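
The idiom in this patch can be tried outside the kernel. The sketch below is a user-space illustration under assumptions: `struct camera` is a hypothetical stand-in for `struct zr364xx_camera`, and `calloc()` plays the role of `kzalloc()`.

```c
#include <stdlib.h>

/* Hypothetical structure standing in for struct zr364xx_camera. */
struct camera {
	int width;
	int height;
};

/*
 * Allocate a zeroed camera. sizeof(*cam) follows the type of the
 * pointer, so the size stays correct if the type of cam changes later.
 */
static struct camera *camera_alloc(void)
{
	struct camera *cam;

	cam = calloc(1, sizeof(*cam));
	return cam; /* NULL on allocation failure */
}
```

Because `sizeof` does not evaluate its operand, `sizeof(*cam)` is safe even though `cam` is still uninitialized at that point.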

Re: [PATCH v8 2/3] PCI: iproc: retry request when CRS returned from EP

2017-08-28 Thread Oza Oza

On Tue, Aug 29, 2017 at 3:17 AM, Bjorn Helgaas  wrote:
> On Thu, Aug 24, 2017 at 10:34:25AM +0530, Oza Pawandeep wrote:
>> PCIe spec r3.1, sec 2.3.2
>> If CRS software visibility is not enabled, the RC must reissue the
>> config request as a new request.
>>
>> - If CRS software visibility is enabled,
>> - for a config read of Vendor ID, the RC must return 0x0001 data
>> - for all other config reads/writes, the RC must reissue the
>>   request
>>
>> iproc PCIe Controller spec:
>> 4.7.3.3. Retry Status On Configuration Cycle
>> Endpoints are allowed to generate retry status on configuration
>> cycles. In this case, the RC needs to re-issue the request. The IP
>> does not handle this because the number of configuration cycles needed
>> will probably be less than the total number of non-posted operations
>> needed.
>>
>> When a retry status is received on the User RX interface for a
>> configuration request that was sent on the User TX interface,
>> it will be indicated with a completion with the CMPL_STATUS field set
>> to 2=CRS, and the user will have to find the address and data values
>> and send a new transaction on the User TX interface.
>> When the internal configuration space returns a retry status during a
>> configuration cycle (user_cscfg = 1) on the Command/Status interface,
>> the pcie_cscrs will assert with the pcie_csack signal to indicate the
>> CRS status.
>> When the CRS Software Visibility Enable register in the Root Control
>> register is enabled, the IP will return the data value to 0x0001 for
>> the Vendor ID value and 0xffff (all 1’s) for the rest of the data in
>> the request for reads of offset 0 that return with CRS status.  This
>> is true for both the User RX Interface and for the Command/Status
>> interface.  When CRS Software Visibility is enabled, the CMPL_STATUS
>> field of the completion on the User RX Interface will not be 2=CRS and
>> the pcie_cscrs signal will not assert on the Command/Status interface.
>>
>> Per PCIe r3.1, sec 2.3.2, config requests that receive completions
>> with Configuration Request Retry Status (CRS) should be reissued by
>> the hardware except reads of the Vendor ID when CRS Software
>> Visibility is enabled.
>>
>> This hardware never reissues configuration requests when it receives
>> CRS completions.
>> Note that, neither PCIe host bridge nor PCIe core re-issues the
>> request for any configuration offset.
>>
>> For config reads, this hardware returns CFG_RETRY_STATUS data when
>> it receives a CRS completion for a config read, regardless of the
>> address of the read or the CRS Software Visibility Enable bit.
>>
>> This patch implements iproc_pcie_config_read, which gets called for
>> Stingray; if it receives a CRS completion, it retries the read.
>> In case of timeout, it returns 0xffffffff.
>> For other iproc-based SoCs, it falls back to the generic PCI APIs.
>>
>> Signed-off-by: Oza Pawandeep 
>>
>> diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
>> index 61d9be6..37f4adf 100644
>> --- a/drivers/pci/host/pcie-iproc.c
>> +++ b/drivers/pci/host/pcie-iproc.c
>> @@ -68,6 +68,9 @@
>>  #define APB_ERR_EN_SHIFT 0
>>  #define APB_ERR_EN   BIT(APB_ERR_EN_SHIFT)
>>
>> +#define CFG_RETRY_STATUS 0xffff0001
>> +#define CFG_RETRY_STATUS_TIMEOUT_US  500000 /* 500 milli-seconds. */
>> +
>>  /* derive the enum index of the outbound/inbound mapping registers */
>>  #define MAP_REG(base_reg, index)  ((base_reg) + (index) * 2)
>>
>> @@ -473,6 +476,64 @@ static void __iomem *iproc_pcie_map_ep_cfg_reg(struct iproc_pcie *pcie,
>>   return (pcie->base + offset);
>>  }
>>
>> +static unsigned int iproc_pcie_cfg_retry(void __iomem *cfg_data_p)
>> +{
>> + int timeout = CFG_RETRY_STATUS_TIMEOUT_US;
>> + unsigned int data;
>> +
>> + /*
>> +  * As per PCIe spec r3.1, sec 2.3.2, CRS Software
>> +  * Visibility only affects config read of the Vendor ID.
>> +  * For config write or any other config read the Root must
>> +  * automatically re-issue configuration request again as a
>> +  * new request.
>> +  *
>> +  * For config reads, this hardware returns CFG_RETRY_STATUS data when
>> +  * it receives a CRS completion for a config read, regardless of the
>> +  * address of the read or the CRS Software Visibility Enable bit. As a
>> +  * partial workaround for this, we retry in software any read that
>> +  * returns CFG_RETRY_STATUS.
>> +  */
>> + data = readl(cfg_data_p);
>> + while (data == CFG_RETRY_STATUS && timeout--) {
>> + udelay(1);
>> + data = readl(cfg_data_p);
>> + }
>> +
>> + if (data == CFG_RETRY_STATUS)
>> + data = 0xffffffff;
>> +
>> + return data;
>> +}
>> +
>> +static int iproc_pcie_config_read(struct pci_bus *bus, unsigned int devfn,
>> + int where, int size, u32 *val)
>> +{
>> + struct iproc_pcie *pcie = iproc_data(bus);
>> +
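
The retry logic quoted above can be exercised in user space. What follows is a sketch under stated assumptions, not the driver's code: the register read is abstracted behind a hypothetical `cfg_read_fn` callback instead of `readl()`, the 1 us delay is omitted, the retry limit is a parameter, and the retry pattern is assumed to be Vendor ID 0x0001 with all ones in the upper half of the 32-bit word.

```c
#include <stdint.h>

/* Assumed 32-bit pattern returned on a CRS completion (see lead-in). */
#define CFG_RETRY_STATUS 0xffff0001u

/* Hypothetical register accessor; the driver uses readl() here. */
typedef uint32_t (*cfg_read_fn)(void *ctx);

/* Re-read while the retry pattern is seen, up to max_retries times. */
static uint32_t cfg_read_retry(cfg_read_fn read_reg, void *ctx,
			       int max_retries)
{
	int timeout = max_retries;
	uint32_t data = read_reg(ctx);

	while (data == CFG_RETRY_STATUS && timeout--)
		data = read_reg(ctx); /* the driver also delays 1 us */

	if (data == CFG_RETRY_STATUS)
		data = 0xffffffffu; /* timed out: report all ones */

	return data;
}

/* Simulated endpoint: answers CRS for the first *ctx reads, then data. */
static uint32_t fake_endpoint(void *ctx)
{
	int *crs_left = ctx;

	if (*crs_left > 0) {
		(*crs_left)--;
		return CFG_RETRY_STATUS;
	}
	return 0x12345678u;
}

/* Simulated endpoint that always returns the retry pattern. */
static uint32_t stuck_endpoint(void *ctx)
{
	(void)ctx;
	return CFG_RETRY_STATUS;
}
```

An endpoint that becomes ready after a few reads yields its real data, while one that keeps returning the pattern reads back as all ones once the retry budget is spent. The second case applies equally to a register that legitimately holds the pattern, since the two situations look identical to this loop.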

[PATCH 1/4] [media] zr364xx: Delete an error message for a failed memory allocation in two functions

2017-08-28 Thread SF Markus Elfring

From: Markus Elfring 
Date: Mon, 28 Aug 2017 22:23:56 +0200

Omit an extra message for a memory allocation failure in these functions.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/media/usb/zr364xx/zr364xx.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/media/usb/zr364xx/zr364xx.c b/drivers/media/usb/zr364xx/zr364xx.c
index efdcd5bd6a4c..97af697dcc81 100644
--- a/drivers/media/usb/zr364xx/zr364xx.c
+++ b/drivers/media/usb/zr364xx/zr364xx.c
@@ -212,7 +212,5 @@ static int send_control_msg(struct usb_device *udev, u8 request, u16 value,
-   if (!transfer_buffer) {
-   dev_err(&udev->dev, "kmalloc(%d) failed\n", size);
+   if (!transfer_buffer)
return -ENOMEM;
-   }
 
memcpy(transfer_buffer, cp, size);
 
@@ -1427,7 +1425,5 @@ static int zr364xx_probe(struct usb_interface *intf,
-   if (cam == NULL) {
-   dev_err(&udev->dev, "cam: out of memory !\n");
+   if (!cam)
return -ENOMEM;
-   }
 
cam->v4l2_dev.release = zr364xx_release;
err = v4l2_device_register(&intf->dev, &cam->v4l2_dev);
-- 
2.14.1
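
The pattern this patch moves to is to return the error code and let the caller decide what to report; in the kernel, a failed kmalloc() or kzalloc() already produces an allocator warning unless __GFP_NOWARN is passed, so a second message adds little. A user-space sketch of the same shape follows, where `copy_to_new_buffer()` is a hypothetical helper and `malloc()` stands in for `kmalloc()`.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy size bytes of src into a freshly allocated buffer.
 * On allocation failure only the error code is returned; nothing is
 * printed here, so the caller decides whether and how to report it.
 */
static int copy_to_new_buffer(const void *src, size_t size, void **out)
{
	void *buf = malloc(size);

	if (!buf)
		return -ENOMEM;

	memcpy(buf, src, size);
	*out = buf;
	return 0;
}
```

The caller owns the returned buffer and is expected to free it.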

[PATCH 0/4] [media] zr364xx: Adjustments for some function implementations

2017-08-28 Thread SF Markus Elfring

From: Markus Elfring 
Date: Tue, 29 Aug 2017 07:17:07 +0200

A few update suggestions were taken into account
from static source code analysis.

Markus Elfring (4):
  Delete an error message for a failed memory allocation in two functions
  Improve a size determination in zr364xx_probe()
  Adjust ten checks for null pointers
  Fix a typo in a comment line of the file header

 drivers/media/usb/zr364xx/zr364xx.c | 34 +++---
 1 file changed, 15 insertions(+), 19 deletions(-)

-- 
2.14.1

Re: [PATCH v8 0/3] PCI: iproc: SOC specific fixes

2017-08-28 Thread Oza Oza

On Tue, Aug 29, 2017 at 3:23 AM, Bjorn Helgaas  wrote:
> On Thu, Aug 24, 2017 at 10:34:23AM +0530, Oza Pawandeep wrote:
>> PCI: iproc: Retry request when CRS returned from EP
>> The above patch adds support for CRS in the PCI RC driver; if CRS is
>> not handled at this lower level, user space PMDs (poll mode drivers)
>> can time out.
>>
>> PCI: iproc: add device shutdown for PCI RC
>> This fixes the issue where certain PCI endpoints are not detected on
>> the Stingray SOC after reboot.
>>
>> Changes Since v7:
>> Factor out the ep config access code.
>>
>> Changes Since v6:
>> Rebased patches on top of Lorenzo's patches.
>> Bjorn's comments addressed.
>> Now the config retry returns 0xffffffff as data.
>> Added reference to PCIe spec and iproc Controller spec in Changelog.
>>
>> Changes Since v5:
>> Ray's comments addressed.
>>
>> Changes Since v4:
>> Bjorn's comments addressed.
>>
>> Changes Since v3:
>> [re-send]
>>
>> Changes Since v2:
>> Fix compilation errors for pcie-iproc-platform.ko which was caught
>> by kbuild.
>>
>> Oza Pawandeep (3):
>>   PCI: iproc: factor-out ep configuration access
>>   PCI: iproc: Retry request when CRS returned from EP
>>   PCI: iproc: add device shutdown for PCI RC
>>
>>  drivers/pci/host/pcie-iproc-platform.c |   8 ++
>>  drivers/pci/host/pcie-iproc.c  | 143 ++---
>>  drivers/pci/host/pcie-iproc.h  |   1 +
>>  3 files changed, 124 insertions(+), 28 deletions(-)
>
> I applied these to pci/host-iproc for v4.14.  Man, this is ugly.
>
> I reworked the changelog to try to make it more readable.  I also tried to
> disable the PCI_EXP_RTCAP_CRSVIS bit, which advertises CRS SV support.  And
> I removed what looked like a duplicate pci_generic_config_read32() call.
> And I added a warning about the fact that we corrupt reads of config
> registers that happen to contain 0xffff0001.
>
> I'm pretty sure I broke something, so please take a look.

Appreciate your time in adding PCI_EXP_RTCAP_CRSVIS and the other changes.
I just tested the patch and it works fine, which tells us that the CRS
visibility bit has no effect.

So things look okay to me.

Regards,
Oza.
>
> Incremental diff from your v8 to what's on pci/host-iproc:
>
> diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
> index cbdabe8a073e..8bd5e544b1c1 100644
> --- a/drivers/pci/host/pcie-iproc.c
> +++ b/drivers/pci/host/pcie-iproc.c
> @@ -69,7 +69,7 @@
>  #define APB_ERR_EN   BIT(APB_ERR_EN_SHIFT)
>
>  #define CFG_RETRY_STATUS 0xffff0001
> -#define CFG_RETRY_STATUS_TIMEOUT_US  500000 /* 500 milli-seconds. */
> +#define CFG_RETRY_STATUS_TIMEOUT_US  500000 /* 500 milliseconds */
>
>  /* derive the enum index of the outbound/inbound mapping registers */
>  #define MAP_REG(base_reg, index)  ((base_reg) + (index) * 2)
> @@ -482,17 +482,21 @@ static unsigned int iproc_pcie_cfg_retry(void __iomem *cfg_data_p)
> unsigned int data;
>
> /*
> -* As per PCIe spec r3.1, sec 2.3.2, CRS Software
> -* Visibility only affects config read of the Vendor ID.
> -* For config write or any other config read the Root must
> -* automatically re-issue configuration request again as a
> -* new request.
> +* As per PCIe spec r3.1, sec 2.3.2, CRS Software Visibility only
> +* affects config reads of the Vendor ID.  For config writes or any
> +* other config reads, the Root may automatically reissue the
> +* configuration request again as a new request.
>  *
> -* For config reads, this hardware returns CFG_RETRY_STATUS data when
> -* it receives a CRS completion for a config read, regardless of the
> -* address of the read or the CRS Software Visibility Enable bit. As a
> +* For config reads, this hardware returns CFG_RETRY_STATUS data
> +* when it receives a CRS completion, regardless of the address of
> +* the read or the CRS Software Visibility Enable bit.  As a
>  * partial workaround for this, we retry in software any read that
>  * returns CFG_RETRY_STATUS.
> +*
> +* Note that a non-Vendor ID config register may have a value of
> +* CFG_RETRY_STATUS.  If we read that, we can't distinguish it from
> +* a CRS completion, so we will incorrectly retry the read and
> +* eventually return the wrong data (0xffffffff).
>  */
> data = readl(cfg_data_p);
> while (data == CFG_RETRY_STATUS && timeout--) {
> @@ -515,10 +519,19 @@ static int iproc_pcie_config_read(struct pci_bus *bus, unsigned int devfn,
> unsigned int busno = bus->number;
> void __iomem *cfg_data_p;
> unsigned int data;
> +   int ret;
>
> -   /* root complex access. */
> -   if (busno == 0)
> -   return pci_generic_config_read32(bus, devfn, where, size, val);
> +   /* root complex access */
> +   if

Re: [Cocci] cocci: remove unnecessary casts of void * while avoiding casts with user or force ?

2017-08-28 Thread Julia Lawall

On Mon, 28 Aug 2017, Joe Perches wrote:

> A simple cocci script that removes unnecessary casts of
> a void * will also remove casts with __force or __user

Unfortunately, attributes are currently not supported inside casts.  This
can be done in a hackish way (possible false negatives) as follows:

---

@initialize:ocaml@
@@

let close (p1,p2) =
  let r = (List.hd p1).line_end in
  let l = (List.hd p2).line in
  let rc = (List.hd p1).col_end in
  let lc = (List.hd p2).col in
  r = l && lc = rc+1

@r@
position p1,p2;
expression f,e;
type T;
@@

f(..., // generalize this rule as needed
 (T@p1 *@p2)
 e,...)

@@
position r.p2 : script:ocaml(r.p1) { close(p1,p2) };
position r.p1;
expression e;
type T;
@@

- (T@p1 *@p2)
  e

---

Basically, it assumes that if the type and the * are more than one space
apart then there is something important there, and the cast is not
removed.
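
To make the heuristic concrete, here is a rough C rendering of the `close` predicate from the script. The struct and function names are hypothetical, and I am assuming that `col_end` is the column just past the type token, which is what the "one space" reading implies.

```c
#include <stdbool.h>

/* Hypothetical position record with the four fields the script reads. */
struct pos {
	int line;     /* line on which the token starts */
	int line_end; /* line on which the token ends */
	int col;      /* column at which the token starts */
	int col_end;  /* assumed: column just past the end of the token */
};

/*
 * True when the '*' sits on the line where the type ends and exactly one
 * column after it, i.e. nothing but a single space separates the two.
 */
static bool close_together(const struct pos *type_pos,
			   const struct pos *star_pos)
{
	return type_pos->line_end == star_pos->line &&
	       star_pos->col == type_pos->col_end + 1;
}
```

For `(void *)e` the type ends at column 5 and the star is at column 6, so the cast is a candidate for removal. For `(void __user *)e` the star is at column 13, the predicate fails, and the cast is kept. A plain cast written with two spaces is also kept, which is one of the false negatives mentioned above.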

julia

Re: [PATCH v3 1/3] mfd: Add support for Cherry Trail Dollar Cove TI PMIC

2017-08-28 Thread Takashi Iwai

On Tue, 29 Aug 2017 00:31:15 +0200,
Rafael J. Wysocki wrote:
> 
> On Fri, Aug 25, 2017 at 3:44 PM, Takashi Iwai  wrote:
> > This patch adds the MFD driver for Dollar Cove (TI version) PMIC with
> > ACPI INT33F5 that is found on some Intel Cherry Trail devices.
> > The driver is based on the original work by Intel, found at:
> >   https://github.com/01org/ProductionKernelQuilts
> >
> > This is a minimal version for adding the basic resources.  Currently,
> > only ACPI PMIC opregion and the external power-button are used.
> >
> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=193891
> > Reviewed-by: Mika Westerberg 
> > Reviewed-by: Andy Shevchenko 
> > Signed-off-by: Takashi Iwai 
> 
> I need an ACK from Lee on this one.

Yeah, the MFD patch is a prerequisite for patches 2 and 3, of course...

Lee, could you review patch 1?


thanks,

Takashi

Re: [PATCH] lsm_audit: use get_task_comm

2017-08-28 Thread Richard Guy Briggs

On 2017-08-28 17:54, Paul Moore wrote:
> On Mon, Aug 28, 2017 at 9:58 AM, Geliang Tang  wrote:
> > get_task_comm() copies the task's comm under the task_lock, it's safer
> > than directly using memcpy().
> >
> > Signed-off-by: Geliang Tang 
> > ---
> >  security/lsm_audit.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/security/lsm_audit.c b/security/lsm_audit.c
> > index 28d4c3a..555b1c4 100644
> > --- a/security/lsm_audit.c
> > +++ b/security/lsm_audit.c
> > @@ -221,7 +221,7 @@ static void dump_common_audit_data(struct audit_buffer *ab,
> > BUILD_BUG_ON(sizeof(a->u) > sizeof(void *)*2);
> >
> > audit_log_format(ab, " pid=%d comm=", task_tgid_nr(current));
> > -   audit_log_untrustedstring(ab, memcpy(comm, current->comm, sizeof(comm)));
> > +   audit_log_untrustedstring(ab, get_task_comm(comm, current));
> >
> > switch (a->type) {
> > case LSM_AUDIT_DATA_NONE:
> > @@ -312,7 +312,7 @@ static void dump_common_audit_data(struct audit_buffer *ab,
> > char comm[sizeof(tsk->comm)];
> > audit_log_format(ab, " opid=%d ocomm=", pid);
> > audit_log_untrustedstring(ab,
> > -   memcpy(comm, tsk->comm, sizeof(comm)));
> > +   get_task_comm(comm, tsk));
> 
> [NOTE: adding the linux-audit mailing list to this thread]

There was previously pushback about using get_task_comm() with its
locking, which is why in this particular location, a memcpy was chosen
instead.

This was done in:
5deeb5cece3f9b30c8129786726b9d02c412c8ca rgb 2015-04-14
("lsm: copy comm before calling audit_log to avoid race in string printing")

From that commit:
Using get_task_comm() to get a copy while acquiring the task_lock to prevent
this and to prevent the result from being a mixture of old and new values of
comm would incur potentially unacceptable overhead, considering that the value
can be influenced by userspace and therefore untrusted anyways.
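
A user-space sketch of the lockless approach that commit settled on: take a private copy of comm first, then hand only the copy to the logger. The helper name is hypothetical, TASK_COMM_LEN is written out as 16 to match the kernel's definition, and the forced terminator is a defensive addition of mine rather than something the existing code does.

```c
#include <string.h>

#define TASK_COMM_LEN 16 /* size of the comm buffer in the kernel */

/*
 * Lockless snapshot: a concurrent rename may still leave a mix of old and
 * new bytes in dst, but the logger only ever sees this private copy, so
 * the string cannot change while it is being printed.
 */
static char *snapshot_comm(char dst[TASK_COMM_LEN],
			   const char src[TASK_COMM_LEN])
{
	memcpy(dst, src, TASK_COMM_LEN);
	dst[TASK_COMM_LEN - 1] = '\0'; /* defensive, not part of the existing code */
	return dst;                    /* same pointer memcpy() would return */
}
```

Returning the destination pointer mirrors memcpy(), which is what lets the existing code pass the memcpy() call straight to audit_log_untrustedstring().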

> This isn't strictly a problem with this patch, but I think we should
> be able to get rid of the 'comm' variable in this if-block and simply
> reuse the 'comm' from the top of the function.  It would be nice to
> include that in this patch.
> 
> Other than that minor nit, this patch looks good to me; if you make
> that small change I'll merge it into the audit/next branch for the
> upcoming merge window.

So, I'd offer a NACK here.

> paul moore

- RGB

--
Richard Guy Briggs 
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

Re: [PATCH 0/4] irda: move it to drivers/staging so we can delete it

2017-08-28 Thread Greg KH

On Mon, Aug 28, 2017 at 04:46:07PM -0700, Joe Perches wrote:
> On Mon, 2017-08-28 at 16:42 -0700, David Miller wrote:
> > From: Greg Kroah-Hartman 
> > Date: Sun, 27 Aug 2017 17:03:30 +0200
> > 
> > > The IRDA code has long been obsolete and broken.  So, to keep people
> > > from trying to use it, and to prevent people from having to maintain it,
> > > let's move it to drivers/staging/ so that we can delete it entirely from
> > > the kernel in a few releases.
> > 
> > No objection, I'll apply this to net-next, thanks Greg.
> 
> Still needs an update to MAINTAINERS.

Oops, forgot those directories, will send a follow-on patch for that.

greg k-h

Re: [PATCH 4.12 00/99] 4.12.10-stable review

2017-08-28 Thread Greg Kroah-Hartman

On Mon, Aug 28, 2017 at 01:40:29PM -0600, Shuah Khan wrote:
> On 08/28/2017 02:03 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.12.10 release.
> > There are 99 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Wed Aug 30 08:04:17 UTC 2017.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.12.10-rc1.gz
> > or in the git tree and branch at:
> >   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> > linux-4.12.y
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> 
> Compiled and booted on my test system. No dmesg regressions.

Thanks for testing all of these and letting me know.

greg k-h

Re: [PATCH 4.12 00/99] 4.12.10-stable review

2017-08-28 Thread Greg Kroah-Hartman

On Mon, Aug 28, 2017 at 05:11:03PM -0700, Guenter Roeck wrote:
> On 08/28/2017 01:03 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.12.10 release.
> > There are 99 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Wed Aug 30 08:04:17 UTC 2017.
> > Anything received after that time might be too late.
> > 
> 
> 
> Build results:
>   total: 145 pass: 145 fail: 0
> Qemu test results:
>   total: 122 pass: 122 fail: 0
> 
> Details are available at http://kerneltests.org/builders.

Great, thanks for testing all of these and letting me know.

greg k-h

Re: [PATCH] Revert "xhci: Limit USB2 port wake support for AMD Promontory hosts"

2017-08-28 Thread Kai-Heng Feng

On Mon, Aug 28, 2017 at 6:14 PM, Mathias Nyman
 wrote:
> On 28.08.2017 12:29, Greg KH wrote:
>>
>> On Tue, Aug 22, 2017 at 05:14:47PM +0800, Kai-Heng Feng wrote:
>>>
>>> This reverts commit dec08194ffeccfa1cf085906b53d301930eae18f.
>>>
>>> Commit dec08194ffec ("xhci: Limit USB2 port wake support for AMD
>>> Promontory hosts") makes all high speed USB ports on ASUS PRIME B350M-A
>>> cease to function after enabling runtime PM.
>>>
>>> All boards with this chipset will be affected, so revert the commit.
>>>
>>> Conflicts:
>>> drivers/usb/host/xhci-pci.c
>>> drivers/usb/host/xhci.h
>>
>>
>> Why are these "Conflicts:" lines here, you did fix up the issues, so
>> there shouldn't be any more conflicts.
>>
>> And if you revert this, don't we still have the original problem here?
>>
>
> Adding more people who were involved in the original patch.
>
> Users are now seeing the unresponsive USB2 ports with Promontory hosts.
> Is there any update on a better way to solve the original issue.
>
> To me a "dead" USB2 port seems like a much worse issue for a user
> than a BIOS disabled port waking up on plug/unplug (wake on
> connect/disconnect), so I'm myself in favor of doing this revert.

At least I can't find "Disable USB2" on my ASUS PRIME B350M-A, so the
new behavior is quite surprising.

>
> But there was a strong push from Promontory developers to get the original
> fix in, and I would like to get some comment from them before I do anything
> about it.

You looped them into the mail thread where I reported the regression two
weeks ago, and there has been no response since then...

>
> Thanks
> -Mathias
>

[PATCH RESEND 2/2] dmaengine: sun6i: support V3s SoC variant

2017-08-28 Thread Icenowy Zheng

From: Icenowy Zheng 

Allwinner V3s has a DMA engine similar to the ones from A31, but with
fewer channels and DRQs.

Add support for it.

Signed-off-by: Icenowy Zheng 
Acked-by: Chen-Yu Tsai 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/dma/sun6i-dma.txt |  1 +
 drivers/dma/sun6i-dma.c | 13 +
 2 files changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/dma/sun6i-dma.txt b/Documentation/devicetree/bindings/dma/sun6i-dma.txt
index 6b267045f522..98fbe1a5c6dd 100644
--- a/Documentation/devicetree/bindings/dma/sun6i-dma.txt
+++ b/Documentation/devicetree/bindings/dma/sun6i-dma.txt
@@ -9,6 +9,7 @@ Required properties:
  "allwinner,sun8i-a23-dma"
  "allwinner,sun8i-a83t-dma"
  "allwinner,sun8i-h3-dma"
+ "allwinner,sun8i-v3s-dma"
 - reg: Should contain the registers base address and length
 - interrupts:  Should contain a reference to the interrupt used by this device
 - clocks:  Should contain a reference to the parent AHB clock
diff --git a/drivers/dma/sun6i-dma.c b/drivers/dma/sun6i-dma.c
index 252b59c1d1d5..bcd496edc70f 100644
--- a/drivers/dma/sun6i-dma.c
+++ b/drivers/dma/sun6i-dma.c
@@ -1040,11 +1040,24 @@ static struct sun6i_dma_config sun8i_h3_dma_cfg = {
.nr_max_vchans   = 34,
 };
 
+/*
+ * The V3s have only 8 physical channels, a maximum DRQ port id of 23,
+ * and a total of 24 usable source and destination endpoints.
+ */
+
+static struct sun6i_dma_config sun8i_v3s_dma_cfg = {
+   .nr_max_channels = 8,
+   .nr_max_requests = 23,
+   .nr_max_vchans   = 24,
+   .gate_needed = true,
+};
+
 static const struct of_device_id sun6i_dma_match[] = {
{ .compatible = "allwinner,sun6i-a31-dma", .data = &sun6i_a31_dma_cfg },
{ .compatible = "allwinner,sun8i-a23-dma", .data = &sun8i_a23_dma_cfg },
{ .compatible = "allwinner,sun8i-a83t-dma", .data = &sun8i_a83t_dma_cfg },
{ .compatible = "allwinner,sun8i-h3-dma", .data = &sun8i_h3_dma_cfg },
+   { .compatible = "allwinner,sun8i-v3s-dma", .data = &sun8i_v3s_dma_cfg },
{ /* sentinel */ }
 };
 MODULE_DEVICE_TABLE(of, sun6i_dma_match);
-- 
2.13.5
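
As a rough illustration of how the new table entry is consumed, here is a stand-alone sketch of a compatible-string lookup. The types and the lookup_config() helper are hypothetical stand-ins; the kernel resolves this through its own OF match helpers. The numbers are the ones from this series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct sun6i_dma_config. */
struct dma_config {
	unsigned int nr_max_channels;
	unsigned int nr_max_requests;
	unsigned int nr_max_vchans;
	bool gate_needed;
};

static const struct dma_config a23_cfg = { 8, 24, 37, true };
static const struct dma_config v3s_cfg = { 8, 23, 24, true };

/* Simplified stand-in for struct of_device_id. */
struct match_entry {
	const char *compatible;
	const struct dma_config *data;
};

static const struct match_entry match_table[] = {
	{ "allwinner,sun8i-a23-dma", &a23_cfg },
	{ "allwinner,sun8i-v3s-dma", &v3s_cfg },
	{ NULL, NULL }, /* sentinel */
};

/* Walk the table up to the sentinel; NULL means "no such variant". */
static const struct dma_config *lookup_config(const char *compatible)
{
	const struct match_entry *m;

	for (m = match_table; m->compatible; m++)
		if (strcmp(m->compatible, compatible) == 0)
			return m->data;

	return NULL;
}
```

With this shape, supporting another SoC variant only takes one more config struct and one more table row, which is all the patch above does for the V3s.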

[PATCH RESEND 1/2] dmaengine: sun6i: make gate bit in sun8i's DMA engines a common quirk

2017-08-28 Thread Icenowy Zheng

From: Icenowy Zheng 

Originally we enabled a special gate bit only when the compatible indicated
A23/A33.

But according to BSP sources and user manuals, more SoCs need this
gate bit.

So make it a common quirk configured in the config struct.

Signed-off-by: Icenowy Zheng 
Reviewed-by: Chen-Yu Tsai 
---
 drivers/dma/sun6i-dma.c | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/dma/sun6i-dma.c b/drivers/dma/sun6i-dma.c
index a2358780ab2c..252b59c1d1d5 100644
--- a/drivers/dma/sun6i-dma.c
+++ b/drivers/dma/sun6i-dma.c
@@ -101,6 +101,17 @@ struct sun6i_dma_config {
u32 nr_max_channels;
u32 nr_max_requests;
u32 nr_max_vchans;
+   /*
+* In the datasheets/user manuals of newer Allwinner SoCs, a special
+* bit (bit 2 at register 0x20) is present.
+* It's named "DMA MCLK interface circuit auto gating bit" in the
+* documents, and the footnote of this register says that this bit
+* should be set up when initializing the DMA controller.
+* Allwinner A23/A33 user manuals do not have this bit documented,
+* however these SoCs really have and need this bit, as seen in the
+* BSP kernel source code.
+*/
+   bool gate_needed;
 };
 
 /*
@@ -1009,6 +1020,7 @@ static struct sun6i_dma_config sun8i_a23_dma_cfg = {
.nr_max_channels = 8,
.nr_max_requests = 24,
.nr_max_vchans   = 37,
+   .gate_needed = true,
 };
 
 static struct sun6i_dma_config sun8i_a83t_dma_cfg = {
@@ -1174,13 +1186,7 @@ static int sun6i_dma_probe(struct platform_device *pdev)
goto err_dma_unregister;
}
 
-   /*
-* sun8i variant requires us to toggle a dma gating register,
-* as seen in Allwinner's SDK. This register is not documented
-* in the A23 user manual.
-*/
-   if (of_device_is_compatible(pdev->dev.of_node,
-   "allwinner,sun8i-a23-dma"))
+   if (sdc->cfg->gate_needed)
writel(SUN8I_DMA_GATE_ENABLE, sdc->base + SUN8I_DMA_GATE);
 
return 0;
-- 
2.13.5
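
A small user-space sketch of the quirk as it would apply at probe time, assuming the register window is modelled as an array of 32-bit words. The names are hypothetical; the offset and bit follow the "bit 2 at register 0x20" wording in the comment added by this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DMA_GATE_REG    0x20u     /* byte offset of the gating register */
#define DMA_GATE_ENABLE (1u << 2) /* "bit 2 at register 0x20" */

/* Only the quirk flag is modelled here. */
struct dma_quirks {
	bool gate_needed;
};

/*
 * regs models the controller's register window, indexed by byte offset / 4.
 * The write happens only for variants whose config asks for it, so the
 * init path no longer needs to compare compatible strings.
 */
static void dma_init_gate(volatile uint32_t *regs, const struct dma_quirks *q)
{
	if (q->gate_needed)
		regs[DMA_GATE_REG / 4] = DMA_GATE_ENABLE;
}
```

The point of the change is visible in the condition: the decision moves from a string comparison in the probe function to a per-variant flag in the config data.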

[PATCH RESEND 0/2] Allwinner V3s DMA support

2017-08-28 Thread Icenowy Zheng

This is a dedicated patchset of Allwinner V3s DMA support, which used
to be part of the audio codec support patchset.

It's a derivation of the DMA part of v3 of the codec patchset.

Icenowy Zheng (2):
  dmaengine: sun6i: make gate bit in sun8i's DMA engines a common quirk
  dmaengine: sun6i: support V3s SoC variant

 .../devicetree/bindings/dma/sun6i-dma.txt  |  1 +
 drivers/dma/sun6i-dma.c| 33 +-
 2 files changed, 27 insertions(+), 7 deletions(-)

-- 
2.13.5

Re: [PATCH v2 15/30] xfs: Define usercopy region in xfs_inode slab cache

2017-08-28 Thread Darrick J. Wong

On Mon, Aug 28, 2017 at 02:57:14PM -0700, Kees Cook wrote:
> On Mon, Aug 28, 2017 at 2:49 PM, Darrick J. Wong
>  wrote:
> > On Mon, Aug 28, 2017 at 02:34:56PM -0700, Kees Cook wrote:
> >> From: David Windsor 
> >>
> >> The XFS inline inode data, stored in struct xfs_inode_t field
> >> i_df.if_u2.if_inline_data and therefore contained in the xfs_inode slab
> >> cache, needs to be copied to/from userspace.
> >>
> >> cache object allocation:
> >> fs/xfs/xfs_icache.c:
> >> xfs_inode_alloc(...):
> >> ...
> >> ip = kmem_zone_alloc(xfs_inode_zone, KM_SLEEP);
> >>
> >> fs/xfs/libxfs/xfs_inode_fork.c:
> >> xfs_init_local_fork(...):
> >> ...
> >> if (mem_size <= sizeof(ifp->if_u2.if_inline_data))
> >> ifp->if_u1.if_data = ifp->if_u2.if_inline_data;
> >
> > Hmm, what happens when mem_size > sizeof(if_inline_data)?  A slab object
> > will be allocated for ifp->if_u1.if_data which can then be used for
> > readlink in the same manner as the example usage trace below.  Does
> > that allocated object have a need for a usercopy annotation like
> > the one we're adding for if_inline_data?  Or is that already covered
> > elsewhere?
> 
> Yeah, the xfs helper kmem_alloc() is used in the other case, which
> ultimately boils down to a call to kmalloc(), which is entirely
> whitelisted by an earlier patch in the series:
> 
> https://lkml.org/lkml/2017/8/28/1026

Ah.  It would've been helpful to have the first three patches cc'd to
the xfs list.  So basically this series establishes the ability to set
regions within a slab object into which copy_to_user can copy memory
contents, and vice versa.  Have you seen any runtime performance impact?
The overhead looks like it ought to be minimal.

> (It's possible that at some future time we can start segregating
> kernel-only kmallocs from usercopy-able kmallocs, but for now, there
> are no plans for this.)

A pity.  It would be interesting to create no-usercopy versions of the
kmalloc-* slabs and see how much of XFS' memory consumption never
touches userspace buffers. :)

--D

> 
> -Kees
> 
> -- 
> Kees Cook
> Pixel Security
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
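
As summarised in the reply above, the series lets a slab cache declare a region inside each object that may be copied to or from userspace. The bounds check this implies can be sketched in plain C as below. This is a simplified model under stated assumptions, not the kernel's implementation: `usercopy_region` and `copy_within_region` are hypothetical names, and offsets are taken relative to the start of the object.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the whitelisted window inside a slab object. */
struct usercopy_region {
	size_t useroffset; /* start of the window, relative to the object */
	size_t usersize;   /* length of the window in bytes */
};

/*
 * Return true when a copy of len bytes starting at offset (relative to the
 * object) lies entirely inside the whitelisted window.
 */
static bool copy_within_region(const struct usercopy_region *r,
			       size_t offset, size_t len)
{
	if (offset < r->useroffset)
		return false;
	/* Compare against the remaining room to avoid overflow in offset + len. */
	size_t into = offset - r->useroffset;
	if (into > r->usersize)
		return false;
	return len <= r->usersize - into;
}
```

For a window of 32 bytes at offset 64, a 16-byte copy starting at offset 80 passes, while a 17-byte copy from the same offset, or any copy starting before offset 64, is rejected.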

Re: [RFC 2/3] sched/fair: use util_est in LB

2017-08-28 Thread Pavan Kondeti

On Fri, Aug 25, 2017 at 3:50 PM, Patrick Bellasi
 wrote:
> When the scheduler looks at the CPU utilization, the current PELT value
> for a CPU is returned straight away. In certain scenarios this can have
> undesired side effects on task placement.
>



> +/**
> + * cpu_util_est: estimated utilization for the specified CPU
> + * @cpu: the CPU to get the estimated utilization for
> + *
> + * The estimated utilization of a CPU is defined to be the maximum between its
> + * PELT's utilization and the sum of the estimated utilization of the tasks
> + * currently RUNNABLE on that CPU.
> + *
> + * This allows to properly represent the expected utilization of a CPU which
> + * has just got a big task running since a long sleep period. At the same time
> + * however it preserves the benefits of the "blocked load" in describing the
> + * potential for other tasks waking up on the same CPU.
> + *
> + * Return: the estimated utilization for the specified CPU
> + */
> +static inline unsigned long cpu_util_est(int cpu)
> +{
> +   struct sched_avg *sa = _rq(cpu)->cfs.avg;
> +   unsigned long util = cpu_util(cpu);
> +
> +   if (!sched_feat(UTIL_EST))
> +   return util;
> +
> +   return max(util, util_est(sa, UTIL_EST_LAST));
> +}
> +
>  static inline int task_util(struct task_struct *p)
>  {
> return p->se.avg.util_avg;
> @@ -6007,11 +6033,19 @@ static int cpu_util_wake(int cpu, struct task_struct *p)
>
> /* Task has no contribution or is new */
> if (cpu != task_cpu(p) || !p->se.avg.last_update_time)
> -   return cpu_util(cpu);
> +   return cpu_util_est(cpu);
>
> capacity = capacity_orig_of(cpu);
> util = max_t(long, cpu_rq(cpu)->cfs.avg.util_avg - task_util(p), 0);
>
> +   /*
> +* Estimated utilization tracks only tasks already enqueued, but still
> +* sometimes can return a bigger value than PELT, for example when the
> +* blocked load is negligible wrt the estimated utilization of the
> +* already enqueued tasks.
> +*/
> +   util = max_t(long, util, cpu_util_est(cpu));
> +

We are supposed to discount the task's util from its CPU. But the
cpu_util_est() can potentially return cpu_util() which includes the
task's utilization.

Thanks,
Pavan

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project
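
The objection above can be checked with a small arithmetic model. The sketch below is not scheduler code: the `model_*` names are hypothetical, plain `long` values stand in for the kernel's signals, and the capacity clamp in `cpu_util()` is ignored, so `cpu_util` is treated as equal to `cfs.avg.util_avg`.

```c
/* Larger of two signed values. */
static long max_long(long a, long b)
{
	return a > b ? a : b;
}

/* Model of cpu_util_est(): the larger of PELT utilization and the estimate. */
static long model_cpu_util_est(long cpu_util, long est_sum)
{
	return max_long(cpu_util, est_sum);
}

/* Model of the patched cpu_util_wake() for a task accounted on this CPU. */
static long model_cpu_util_wake(long cpu_util, long task_util, long est_sum)
{
	long util = max_long(cpu_util - task_util, 0); /* discount the task */

	/* The final max() can bring the task's contribution back in. */
	return max_long(util, model_cpu_util_est(cpu_util, est_sum));
}
```

With a CPU utilization of 600, a task utilization of 400 and an estimated sum of 100, the discounted value is 200, but the final `max()` yields 600 again, so the discount is lost in this model.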

Re: [PATCH net-next] hinic: don't build the module by default

2017-08-28 Thread David Miller

From: Vitaly Kuznetsov 
Date: Mon, 28 Aug 2017 15:16:05 +0200

> We probably don't want to enable code supporting particular hardware by
> default e.g. when someone does 'make defconfig'. Other ethernet modules
> don't do it.
> 
> Signed-off-by: Vitaly Kuznetsov 

Applied, thanks.

Re: [PATCH net-next v2 00/10] net: dsa: add generic debugfs interface

2017-08-28 Thread David Miller

From: Vivien Didelot 
Date: Mon, 28 Aug 2017 15:17:38 -0400

> This patch series adds a generic debugfs interface for the DSA
> framework, so that all switch devices benefit from it, e.g. Marvell,
> Broadcom, Microchip or any other DSA driver.

I've been thinking this over and I agree with the feedback given that
debugfs really isn't appropriate for this.

Please create a DSA device class, and hang these values under
appropriate sysfs device nodes that can be easily found via
/sys/class/dsa/ just as easily as they would be /sys/kernel/debug/dsa/

You really intend these values to be consistent across DSA devices,
and you don't intend to go willy-nilly changing these exported values
arbitrarily over time.  That's what debugfs is for, throw-away
stuff.

So please make these proper device sysfs attributes rather than
debugfs.

Thank you.

Re: [RFC 1/3] sched/fair: add util_est on top of PELT

2017-08-28 Thread Pavan Kondeti

Hi Patrick,

On Fri, Aug 25, 2017 at 3:50 PM, Patrick Bellasi
 wrote:
> The util_avg signal computed by PELT is too variable for some use-cases.
> For example, a big task waking up after a long sleep period will have its
> utilization almost completely decayed. This introduces some latency before
> schedutil will be able to pick the best frequency to run a task.
>
> The same issue can affect task placement. Indeed, since the task
> utilization is already decayed at wakeup, when the task is enqueued on a
> CPU, this can result in a CPU running a big task being temporarily
> represented as almost empty. This leads to a race condition where
> other tasks can potentially be placed on a CPU which has just started to run
> a big task which slept for a relatively long period.
>
> Moreover, the utilization of a task is, by PELT definition, a continuously
> changing metric. This makes some decisions based on the value of PELT's
> utilization outdated almost instantly.
>
> For all these reasons, a more stable signal could probably do a better job
> of representing the expected/estimated utilization of a SE/RQ. Such a
> signal can be easily created on top of PELT by still using it as an
> estimator which produces values to be aggregated once meaningful events
> happen.
>
> This patch adds a simple implementation of util_est, a new signal built on
> top of PELT's util_avg where:
>
> util_est(se) = max(se::util_avg, f(se::util_avg@dequeue_times))
>

I don't see any wrapper function in this patch that implements this
signal. You want to use this signal in the task placement path as a
replacement of task_util(), right?

Thanks,
Pavan

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project
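
The formula quoted above, util_est(se) = max(se::util_avg, f(se::util_avg@dequeue_times)), could be wrapped along the following lines. This is only a sketch of what such a wrapper might look like, not code from the patch: `se_model` and the `model_*` functions are hypothetical, and f() is assumed to return the last sample taken at dequeue time.

```c
/* Hypothetical per-entity state; field names are illustrative. */
struct se_model {
	unsigned long util_avg;    /* current PELT utilization */
	unsigned long util_at_deq; /* util_avg sampled at the last dequeue */
};

/* Record the sample when the entity is dequeued. */
static void model_dequeue(struct se_model *se)
{
	se->util_at_deq = se->util_avg;
}

/* util_est(se) = max(util_avg, f(samples)), with f = last sample here. */
static unsigned long model_util_est(const struct se_model *se)
{
	return se->util_avg > se->util_at_deq ? se->util_avg : se->util_at_deq;
}
```

A task dequeued at a utilization of 800 that wakes up decayed to 50 would still report 800 through this wrapper, which is the behaviour the changelog asks for at wakeup.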

Re: iov_iter_pipe warning.

2017-08-28 Thread Darrick J. Wong

On Mon, Aug 28, 2017 at 04:31:30PM -0400, Dave Jones wrote:
> On Mon, Aug 07, 2017 at 04:18:18PM -0400, Dave Jones wrote:
>  > On Fri, Apr 28, 2017 at 06:20:25PM +0100, Al Viro wrote:
>  >  > On Fri, Apr 28, 2017 at 12:50:24PM -0400, Dave Jones wrote:
>  >  > > currently running v4.11-rc8-75-gf83246089ca0
>  >  > > 
>  >  > > sunrpc bit is for the other unrelated problem I'm chasing.
>  >  > > 
>  >  > > note also, I saw the backtrace without the fs/splice.c changes.
>  >  > 
>  >  > Interesting...  Could you add this and see if that triggers?
>  >  > 
>  >  > diff --git a/fs/splice.c b/fs/splice.c
>  >  > index 540c4a44756c..12a12d9c313f 100644
>  >  > --- a/fs/splice.c
>  >  > +++ b/fs/splice.c
>  >  > @@ -306,6 +306,9 @@ ssize_t generic_file_splice_read(struct file *in, loff_t *ppos,
>  >  > kiocb.ki_pos = *ppos;
>  >  > ret = call_read_iter(in, , );
>  >  > if (ret > 0) {
>  >  > +   if (WARN_ON(iov_iter_count() != len - ret))
>  >  > +   printk(KERN_ERR "ops %p: was %zd, left %zd, returned %d\n",
>  >  > +   in->f_op, len, iov_iter_count(), ret);
>  >  > *ppos = kiocb.ki_pos;
>  >  > file_accessed(in);
>  >  > } else if (ret < 0) {
>  > 
>  > Hey Al,
>  >  Due to a git stash screw up on my part, I've had this leftover WARN_ON
>  > in my tree for the last couple months. (That screw-up might turn out to be
>  > serendipitous if this is a real bug..)
>  > 
>  > Today I decided to change things up and beat up on xfs for a change, and
>  > was able to trigger this again.
>  > 
>  > Is this check no longer valid, or am I triggering the same bug we were chased
>  > down in nfs, but now in xfs ?  (None of the other detritus from that debugging
>  > back in April made it, just those three lines above).
> 
> Revisiting this. I went back and dug out some of the other debug diffs [1]
> from that old thread.
> 
> I can easily trigger this spew on xfs.
> 
> 
> WARNING: CPU: 1 PID: 2251 at fs/splice.c:292 test_it+0xd4/0x1d0
> CPU: 1 PID: 2251 Comm: trinity-c42 Not tainted 4.13.0-rc7-think+ #1 
> task: 880459173a40 task.stack: 88044f7d
> RIP: 0010:test_it+0xd4/0x1d0
> RSP: 0018:88044f7d7878 EFLAGS: 00010283
> RAX:  RBX: 88044f44b968 RCX: 81511ea0
> RDX: 0003 RSI: dc00 RDI: 88044f44ba68
> RBP: 88044f7d78c8 R08: 88046b218ec0 R09: 
> R10: 88044f7d7518 R11:  R12: 1000
> R13: 0001 R14:  R15: 0001
> FS:  7fdbc09b2700() GS:88046b20() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 000459e1d000 CR4: 001406e0
> Call Trace:
>  generic_file_splice_read+0x414/0x4e0
>  ? opipe_prep.part.14+0x180/0x180
>  ? lockdep_init_map+0xb2/0x2b0
>  ? rw_verify_area+0x65/0x150
>  do_splice_to+0xab/0xc0
>  splice_direct_to_actor+0x1f5/0x540
>  ? generic_pipe_buf_nosteal+0x10/0x10
>  ? do_splice_to+0xc0/0xc0
>  ? rw_verify_area+0x9d/0x150
>  do_splice_direct+0x1b9/0x230
>  ? splice_direct_to_actor+0x540/0x540
>  ? __sb_start_write+0x164/0x1c0
>  ? do_sendfile+0x7b3/0x840
>  do_sendfile+0x428/0x840
>  ? do_compat_pwritev64+0xb0/0xb0
>  ? __might_sleep+0x72/0xe0
>  ? kasan_check_write+0x14/0x20
>  SyS_sendfile64+0xa4/0x120
>  ? SyS_sendfile+0x150/0x150
>  ? mark_held_locks+0x23/0xb0
>  ? do_syscall_64+0xc0/0x3e0
>  ? SyS_sendfile+0x150/0x150
>  do_syscall_64+0x1bc/0x3e0
>  ? syscall_return_slowpath+0x240/0x240
>  ? mark_held_locks+0x23/0xb0
>  ? return_from_SYSCALL_64+0x2d/0x7a
>  ? trace_hardirqs_on_caller+0x182/0x260
>  ? trace_hardirqs_on_thunk+0x1a/0x1c
>  entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x7fdbc02dd219
> RSP: 002b:7ffc5024fa48 EFLAGS: 0246 ORIG_RAX: 0028
> RAX: ffda RBX: 0028 RCX: 7fdbc02dd219
> RDX: 7fdbbe348000 RSI: 0011 RDI: 0015
> RBP: 7ffc5024faf0 R08: 006d R09: 0094e82f2c730a50
> R10: 1000 R11: 0246 R12: 0002
> R13: 7fdbc0885058 R14: 7fdbc09b2698 R15: 7fdbc0885000
> ---[ end trace a5847ef0f7be7e20 ]---
> asked to read 4096, claims to have read 1
> actual size of data in pipe 4096 
> [0:4096]
> f_op: a058c920, f_flags: 49154, pos: 0/1, size: 0
> 
> 
> I'm still trying to narrow down an exact reproducer, but it seems having
> trinity do a combination of sendfile & writev, with pipes and regular
> files as fd's is the best repro.
> 
> Is this a real problem, or am I chasing ghosts ?  That it doesn't happen
> on ext4 or btrfs is making me wonder...

 I haven't heard of any problems w/ directio xfs lately, but OTOH
I think it's the only filesystem that uses iomap_dio_rw, which would
explain why ext4/btrfs don't have this problem.

Granted that's idle speculation; is there a
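
For reference, the invariant behind the WARN_ON quoted in this thread is that, after a read returning ret > 0 bytes out of a request for len, the iterator should have exactly len - ret bytes left. A standalone sketch of that check follows; `splice_read_consistent` is a hypothetical helper, and treating ret outside (0, len] as inconsistent is a choice made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * After a read that returned ret > 0 bytes out of a request for len,
 * the iterator should have exactly len - ret bytes left.
 */
static bool splice_read_consistent(size_t len, long ret, size_t left)
{
	if (ret <= 0 || (size_t)ret > len)
		return false;
	return left == len - (size_t)ret;
}
```

With values resembling the debug output above (4096 requested, 1 reported as read), the check only passes if 4095 bytes remain. Assuming the iterator was fully consumed, so that 0 bytes remain, it fails, which matches the shape of the reported warning.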

Re: [PATCH] be2net: Fix some u16 fields appropriately

2017-08-28 Thread David Miller

From: 严海双 
Date: Tue, 29 Aug 2017 09:04:57 +0800

> The GET_TX_COMPL_BITS comes from amap_get which also returns a 32-bit value:

It never returns a value with more than 16-bits of significance for
this specific call.

Please stop trying to be semantically clever when arguing about this
change.

It's not about types, it's about what range of values the struct
member can actually hold.
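
The distinction drawn above, between the return type of a getter and the range of values it can produce, can be illustrated with a generic bit-field extractor. `get_bits` below is a hypothetical helper written for this sketch, not the be2net `amap_get` macro, and it assumes shift < 32.

```c
#include <stdint.h>

/* Hypothetical field extractor: returns `width` bits starting at `shift`. */
static uint32_t get_bits(uint32_t word, unsigned shift, unsigned width)
{
	uint32_t mask = width >= 32 ? UINT32_MAX : ((UINT32_C(1) << width) - 1);

	return (word >> shift) & mask;
}
```

Although the return type is 32 bits wide, a call with width 16 can never yield more than 0xFFFF, so storing the result in a 16-bit member loses nothing.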

linux-next: manual merge of the md tree with the block tree

2017-08-28 Thread Stephen Rothwell

Hi Shaohua,

Today's linux-next merge of the md tree got a conflict in:

  drivers/md/raid5-ppl.c

between commit:

  74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index")

from the block tree and commit:

  ddc088238cd6 ("md: Runtime support for multiple ppls")

from the md tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell
diff --cc drivers/md/raid5-ppl.c
index 1e237c40d6fa,a98ef172f8e8..
--- a/drivers/md/raid5-ppl.c
+++ b/drivers/md/raid5-ppl.c
@@@ -451,12 -456,25 +456,25 @@@ static void ppl_submit_iounit(struct pp
pplhdr->entries_count = cpu_to_le32(io->entries_count);
pplhdr->checksum = cpu_to_le32(~crc32c_le(~0, pplhdr, PPL_HEADER_SIZE));
  
+   /* Rewind the buffer if current PPL is larger then remaining space */
+   if (log->use_multippl &&
+   log->rdev->ppl.sector + log->rdev->ppl.size - log->next_io_sector <
+   (PPL_HEADER_SIZE + io->pp_size) >> 9)
+   log->next_io_sector = log->rdev->ppl.sector;
+ 
+ 
bio->bi_end_io = ppl_log_endio;
bio->bi_opf = REQ_OP_WRITE | REQ_FUA;
 -  bio->bi_bdev = log->rdev->bdev;
 +  bio_set_dev(bio, log->rdev->bdev);
-   bio->bi_iter.bi_sector = log->rdev->ppl.sector;
+   bio->bi_iter.bi_sector = log->next_io_sector;
bio_add_page(bio, io->header_page, PAGE_SIZE, 0);
  
+   pr_debug("%s: log->current_io_sector: %llu\n", __func__,
+   (unsigned long long)log->next_io_sector);
+ 
+   if (log->use_multippl)
+   log->next_io_sector += (PPL_HEADER_SIZE + io->pp_size) >> 9;
+ 
list_for_each_entry(sh, >stripe_list, log_list) {
/* entries for full stripe writes have no partial parity */
if (test_bit(STRIPE_FULL_WRITE, >state))
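
The rewind-and-advance arithmetic in the merged hunk can be modelled outside the kernel. The sketch below is an illustration under stated assumptions: `ppl_model` and `ppl_place` are hypothetical names, the header size of 4096 bytes is assumed rather than taken from the driver, sectors are 512 bytes, and the multi-PPL mode (`use_multippl`) is taken to be enabled.

```c
#include <stdint.h>

#define PPL_HEADER_SIZE_ASSUMED 4096u /* assumption, in bytes */

/* Hypothetical log state, in 512-byte sectors. */
struct ppl_model {
	uint64_t start;          /* first sector of the PPL area (ppl.sector) */
	uint64_t size;           /* size of the PPL area (ppl.size) */
	uint64_t next_io_sector; /* where the next write goes */
};

/* Pick the sector for an entry with pp_size bytes of parity, then advance. */
static uint64_t ppl_place(struct ppl_model *log, uint32_t pp_size)
{
	uint64_t need = (PPL_HEADER_SIZE_ASSUMED + pp_size) >> 9;

	/* Rewind if the entry does not fit in the remaining space. */
	if (log->start + log->size - log->next_io_sector < need)
		log->next_io_sector = log->start;

	uint64_t at = log->next_io_sector;

	log->next_io_sector += need;
	return at;
}
```

With a 32-sector area starting at sector 100 and entries needing 16 sectors each, successive writes land at sectors 100, 116 and then 100 again after the rewind.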

[PATCH v6] FlexRM support in VFIO platform

2017-08-28 Thread Anup Patel

This patchset primarily adds Broadcom FlexRM reset module for
VFIO platform driver.

The patches are based on Linux-4.13-rc3 and can also be
found at flexrm-vfio-v6 branch of
https://github.com/Broadcom/arm64-linux.git

Changes since v5:
 - Make kconfig option VFIO_PLATFORM_BCMFLEXRM_RESET
   default to ARCH_BCM_IPROC

Changes since v4:
 - Use "--timeout" instead of "timeout--" in
   vfio_platform_bcmflexrm_shutdown()

Changes since v3:
 - Improve "depends on" for Kconfig option
   VFIO_PLATFORM_BCMFLEXRM_RESET
 - Fix typo in pr_warn() called by
   vfio_platform_bcmflexrm_shutdown()
 - Return error from vfio_platform_bcmflexrm_shutdown()
   when FlexRM ring flush timeout happens

Changes since v2:
 - Remove PATCH1 because fixing VFIO no-IOMMU mode is
   a separate topic

Changes since v1:
 - Remove iommu_present() check in vfio_iommu_group_get()
 - Drop PATCH1-to-PATCH3 because IOMMU_CAP_BYPASS is not
   required
 - Move additional comments out of license header in
   vfio_platform_bcmflexrm.c

Anup Patel (1):
  vfio: platform: reset: Add Broadcom FlexRM reset module

 drivers/vfio/platform/reset/Kconfig|   9 ++
 drivers/vfio/platform/reset/Makefile   |   1 +
 .../vfio/platform/reset/vfio_platform_bcmflexrm.c  | 100 +
 3 files changed, 110 insertions(+)
 create mode 100644 drivers/vfio/platform/reset/vfio_platform_bcmflexrm.c

-- 
2.7.4
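
The v4 to v5 change from "timeout--" to "--timeout" matters when the counter is tested after the polling loop. The sketch below shows the difference in isolation. It assumes a do/while polling loop whose condition never becomes true, which may not match the exact loop in the driver; both function names are hypothetical, and the starting value must be at least 1.

```c
/* Poll a condition that never becomes true, post-decrement style. */
static int poll_post_decrement(int timeout)
{
	do {
		/* the hardware-ready check would go here; it never succeeds */
	} while (timeout--);

	return timeout; /* value seen by a later "if (!timeout)" test */
}

/* Same loop with pre-decrement. */
static int poll_pre_decrement(int timeout)
{
	do {
		/* the hardware-ready check would go here; it never succeeds */
	} while (--timeout);

	return timeout;
}
```

Starting from 3, the post-decrement loop exits with the counter at -1, so a following `if (!timeout)` test would miss the timeout, while the pre-decrement loop exits with the counter at 0 and the test fires.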

[PATCH v6] vfio: platform: reset: Add Broadcom FlexRM reset module

2017-08-28 Thread Anup Patel

This patch adds Broadcom FlexRM low-level reset for
VFIO platform.

It will do the following:
1. Disable/Deactivate each FlexRM ring
2. Flush each FlexRM ring

The cleanup sequence for FlexRM rings is adapted from
Broadcom FlexRM mailbox driver.

Signed-off-by: Anup Patel 
Reviewed-by: Oza Oza 
Reviewed-by: Scott Branden 
Reviewed-by: Eric Auger 
---
 drivers/vfio/platform/reset/Kconfig|   9 ++
 drivers/vfio/platform/reset/Makefile   |   1 +
 .../vfio/platform/reset/vfio_platform_bcmflexrm.c  | 100 +
 3 files changed, 110 insertions(+)
 create mode 100644 drivers/vfio/platform/reset/vfio_platform_bcmflexrm.c

diff --git a/drivers/vfio/platform/reset/Kconfig 
b/drivers/vfio/platform/reset/Kconfig
index 705..392e3c0 100644
--- a/drivers/vfio/platform/reset/Kconfig
+++ b/drivers/vfio/platform/reset/Kconfig
@@ -13,3 +13,12 @@ config VFIO_PLATFORM_AMDXGBE_RESET
  Enables the VFIO platform driver to handle reset for AMD XGBE
 
  If you don't know what to do here, say N.
+
+config VFIO_PLATFORM_BCMFLEXRM_RESET
+   tristate "VFIO support for Broadcom FlexRM reset"
+   depends on VFIO_PLATFORM && (ARCH_BCM_IPROC || COMPILE_TEST)
+   default ARCH_BCM_IPROC
+   help
+ Enables the VFIO platform driver to handle reset for Broadcom FlexRM
+
+ If you don't know what to do here, say N.
diff --git a/drivers/vfio/platform/reset/Makefile 
b/drivers/vfio/platform/reset/Makefile
index 93f4e23..8d9874b 100644
--- a/drivers/vfio/platform/reset/Makefile
+++ b/drivers/vfio/platform/reset/Makefile
@@ -5,3 +5,4 @@ ccflags-y += -Idrivers/vfio/platform
 
 obj-$(CONFIG_VFIO_PLATFORM_CALXEDAXGMAC_RESET) += vfio-platform-calxedaxgmac.o
 obj-$(CONFIG_VFIO_PLATFORM_AMDXGBE_RESET) += vfio-platform-amdxgbe.o
+obj-$(CONFIG_VFIO_PLATFORM_BCMFLEXRM_RESET) += vfio_platform_bcmflexrm.o
diff --git a/drivers/vfio/platform/reset/vfio_platform_bcmflexrm.c 
b/drivers/vfio/platform/reset/vfio_platform_bcmflexrm.c
new file mode 100644
index 000..966a813
--- /dev/null
+++ b/drivers/vfio/platform/reset/vfio_platform_bcmflexrm.c
@@ -0,0 +1,100 @@
+/*
+ * Copyright (C) 2017 Broadcom
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+/*
+ * This driver provides reset support for Broadcom FlexRM ring manager
+ * to VFIO platform.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vfio_platform_private.h"
+
+/* FlexRM configuration */
+#define RING_REGS_SIZE 0x1
+#define RING_VER_MAGIC 0x76303031
+
+/* Per-Ring register offsets */
+#define RING_VER            0x000
+#define RING_CONTROL        0x034
+#define RING_FLUSH_DONE     0x038
+
+/* Register RING_CONTROL fields */
+#define CONTROL_FLUSH_SHIFT 5
+
+/* Register RING_FLUSH_DONE fields */
+#define FLUSH_DONE_MASK     0x1
+
+static int vfio_platform_bcmflexrm_shutdown(void __iomem *ring)
+{
+   unsigned int timeout;
+
+   /* Disable/inactivate ring */
+   writel_relaxed(0x0, ring + RING_CONTROL);
+
+   /* Flush ring with timeout of 1s */
+   timeout = 1000;
+   writel_relaxed(BIT(CONTROL_FLUSH_SHIFT), ring + RING_CONTROL);
+   do {
+   if (readl_relaxed(ring + RING_FLUSH_DONE) & FLUSH_DONE_MASK)
+   break;
+   mdelay(1);
+   } while (--timeout);
+
+   if (!timeout) {
+   pr_warn("VFIO FlexRM shutdown timeout\n");
+   return -ETIMEDOUT;
+   }
+
+   return 0;
+}
+
+static int vfio_platform_bcmflexrm_reset(struct vfio_platform_device *vdev)
+{
+   int rc = 0;
+   void __iomem *ring;
+   struct vfio_platform_region *reg = &vdev->regions[0];
+
+   /* Map FlexRM ring registers if not mapped */
+   if (!reg->ioaddr) {
+   reg->ioaddr = ioremap_nocache(reg->addr, reg->size);
+   if (!reg->ioaddr)
+   return -ENOMEM;
+   }
+
+   /* Discover and shutdown each FlexRM ring */
+   for (ring = reg->ioaddr;
+ring < (reg->ioaddr + reg->size); ring += RING_REGS_SIZE) {
+   if (readl_relaxed(ring + RING_VER) == RING_VER_MAGIC) {
+   rc = vfio_platform_bcmflexrm_shutdown(ring);
+   if (rc)
+
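
The reset flow described in the commit message (discover each ring, disable it, then request a flush) can be sketched in plain user-space C against a simulated register file. This is an illustration under stated assumptions only: the array-backed registers, the SIM_ constants and the function names are invented for this sketch, the stride is an arbitrary small value, and the real driver uses MMIO accessors and waits for the hardware to report flush completion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout only: word indexes, not the hardware byte offsets. */
#define SIM_RING_WORDS   16          /* words per simulated ring (assumption) */
#define SIM_RING_VER     0           /* version word of a ring */
#define SIM_RING_CONTROL 1           /* control word of a ring */
#define SIM_VER_MAGIC    0x76303031u /* same value as RING_VER_MAGIC */
#define SIM_FLUSH_BIT    (1u << 5)   /* mirrors BIT(CONTROL_FLUSH_SHIFT) */

/* Disable the ring first, then request a flush. */
static void sim_ring_shutdown(uint32_t *ring)
{
	ring[SIM_RING_CONTROL] = 0;
	ring[SIM_RING_CONTROL] = SIM_FLUSH_BIT;
}

/*
 * Walk the region one ring-sized stride at a time and shut down every
 * slot whose version word matches the magic value. Slots without the
 * magic are left untouched. Returns the number of rings found.
 */
static size_t sim_reset_region(uint32_t *regs, size_t nwords)
{
	size_t off, found = 0;

	assert(regs != NULL);
	for (off = 0; off + SIM_RING_WORDS <= nwords; off += SIM_RING_WORDS) {
		if (regs[off + SIM_RING_VER] != SIM_VER_MAGIC)
			continue;
		sim_ring_shutdown(&regs[off]);
		found++;
	}
	return found;
}
```

Checking the version word before touching a slot is what lets the loop skip parts of the region that do not hold a ring.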

Re: [PATCH] initramfs: Fix disabling of initramfs (and its compression)

2017-08-28 Thread Florian Fainelli

On 08/28/2017 08:09 PM, Nicholas Piggin wrote:
> On Mon, 28 Aug 2017 13:03:31 -0700
> Florian Fainelli  wrote:
> 
>> On 05/21/2017 07:46 PM, Nicholas Piggin wrote:
>>> On Sat, 20 May 2017 20:33:35 -0700
>>> Florian Fainelli  wrote:
>>>   
>>>> Commit db2aa7fd15e8 ("initramfs: allow again choice of the embedded
>>>> initram compression algorithm") introduced the possibility to select the
>>>> initramfs compression algorithm from Kconfig and while this is a nice
>>>> feature it broke the use case described below.
>>>>
>>>> Here is what my build system does:
>>>>
>>>> - kernel is initially configured not to have an initramfs included
>>>> - build the user space root file system
>>>> - re-configure the kernel to have an initramfs included
>>>> (CONFIG_INITRAMFS_SOURCE="/path/to/romfs") and set relevant
>>>> CONFIG_INITRAMFS options, in my case, no compression option
>>>> (CONFIG_INITRAMFS_COMPRESSION_NONE)
>>>> - kernel is re-built with these options -> kernel+initramfs image is
>>>>   copied
>>>> - kernel is re-built again without these options -> kernel image is
>>>>   copied
>>>>
>>>> Building a kernel without an initramfs means setting this option:
>>>>
>>>> CONFIG_INITRAMFS_SOURCE="" (and this one only)
>>>>
>>>> whereas building a kernel with an initramfs means setting these options:
>>>>
>>>> CONFIG_INITRAMFS_SOURCE="/home/fainelli/work/uclinux-rootfs/romfs
>>>> /home/fainelli/work/uclinux-rootfs/misc/initramfs.dev"
>>>> CONFIG_INITRAMFS_ROOT_UID=1000
>>>> CONFIG_INITRAMFS_ROOT_GID=1000
>>>> CONFIG_INITRAMFS_COMPRESSION_NONE=y
>>>> CONFIG_INITRAMFS_COMPRESSION=""
>>>>
>>>> Commit db2aa7fd15e857891cefbada8348c8d938c7a2bc ("initramfs: allow again
>>>> choice of the embedded initram compression algorithm") is problematic
>>>> because CONFIG_INITRAMFS_COMPRESSION which is used to determine the
>>>> initramfs_data.cpio extension/compression is a string, and due to how
>>>> Kconfig works it will evaluate in order, how to assign it.
>>>>
>>>> Setting CONFIG_INITRAMFS_COMPRESSION_NONE with
>>>> CONFIG_INITRAMFS_SOURCE="" cannot possibly work (because of the depends
>>>> on INITRAMFS_SOURCE!="" imposed on CONFIG_INITRAMFS_COMPRESSION ) yet we
>>>> still get CONFIG_INITRAMFS_COMPRESSION assigned to ".gz" because
>>>> CONFIG_RD_GZIP=y is set in my kernel, even when there is no initramfs
>>>> being built.
>>>>
>>>> So we basically end-up generating two initramfs_data.cpio* files, one
>>>> without extension, and one with .gz. This causes usr/Makefile to track
>>>> usr/initramfs_data.cpio.gz, and not usr/initramfs_data.cpio anymore,
>>>> that is also largely problematic after
>>>> 9e3596b0c6539e28546ff7c72a06576627068353 ("kbuild: initramfs cleanup,
>>>> set target from Kconfig") because we used to track all possible
>>>> initramfs_data files in the $(targets) variable before that commit.
>>>>
>>>> The end result is that the kernel with an initramfs clearly does not
>>>> contain what we expect it to, it has a stale initramfs_data.cpio file
>>>> built into it, and we keep re-generating an initramfs_data.cpio.gz file
>>>> which is not the one that we want to include in the kernel image proper.
>>>>
>>>> The fix consists in hiding CONFIG_INITRAMFS_COMPRESSION when
>>>> CONFIG_INITRAMFS_SOURCE="". This puts us back in a state to the pre-4.10
>>>> behavior where we can properly disable and re-enable initramfs within
>>>> the same kernel .config file, and be in control of what
>>>> CONFIG_INITRAMFS_COMPRESSION is set to.
>>>>
>>>> Fixes: db2aa7fd15e8 ("initramfs: allow again choice of the embedded
>>>> initram compression algorithm")
>>>> Fixes: 9e3596b0c653 ("kbuild: initramfs cleanup, set target from Kconfig")
>>>> Signed-off-by: Florian Fainelli
>>>
>>> This is very thorough, thank you for tracking it down and fixing it.
>>>
>>> I can't say I've worked through the problem in the code, but your
>>> changelog and the proposed fix seem reasonable to me. So for what
>>> it's worth:
>>>
>>> Acked-by: Nicholas Piggin   
>>
>> Well, I am looking at this again, and it's still broken, the same test
>> case is involved, except this time, I am switching between no-initramfs
>> and initramfs with gzip compression (the key thing is using a
>> compression of some sort). The end result is the following:
>>
>> - change stuff in the rootfs
>> - build the kernel with initramfs, CONFIG_INITRAMFS_COMPRESSION_GZIP=y,
>> usr/initramfs_data.cpio.gz gets generated correctly the first time
>> - build the kernel without initramfs,
>> CONFIG_INITRAMFS_COMPRESSION_NONE=y, usr/initramfs_data.cpio gets generated
>>
>> Now back to step 1 add some files, and we can see that
>> usr/initramfs_data.cpio.gz is now stale from before...
>>
>> So while my earlier fix switched the initramfs w/o compression to no
>> initramfs rebuild, now this does not work because we still have two
>> files left to be tracked:
>>
>> usr/initramfs_data.cpio (no compression, or when initramfs is disabled)
>> and usr/initramfs_data.cpio.$(suffix_y)
>>
>> How would you go

[PATCH v5 1/2] dt-bindings: i2c: Add Spreadtrum I2C controller documentation

2017-08-28 Thread Baolin Wang

This patch adds the binding documentation for Spreadtrum I2C
controller device.

Signed-off-by: Baolin Wang 
Acked-by: Rob Herring 
---
Changes since v4:
 - No updates.

Changes since v3:
 - Add Ack from RobH.

Changes since v2:
 - Change compatible strings to be SoC specific.

Changes since v1:
 - No updates.
---
 Documentation/devicetree/bindings/i2c/i2c-sprd.txt |   31 
 1 file changed, 31 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/i2c/i2c-sprd.txt

diff --git a/Documentation/devicetree/bindings/i2c/i2c-sprd.txt 
b/Documentation/devicetree/bindings/i2c/i2c-sprd.txt
new file mode 100644
index 000..60b7cda
--- /dev/null
+++ b/Documentation/devicetree/bindings/i2c/i2c-sprd.txt
@@ -0,0 +1,31 @@
+I2C for Spreadtrum platforms
+
+Required properties:
+- compatible: Should be "sprd,sc9860-i2c".
+- reg: Specify the physical base address of the controller and length
+  of memory mapped region.
+- interrupts: Should contain I2C interrupt.
+- clock-names: Should contain following entries:
+  "i2c" for I2C clock,
+  "source" for I2C source (parent) clock,
+  "enable" for I2C module enable clock.
+- clocks: Should contain a clock specifier for each entry in clock-names.
+- clock-frequency: Contains desired I2C bus clock frequency in Hz.
+- #address-cells: Should be 1 to describe address cells for I2C device address.
+- #size-cells: Should be 0, meaning no size cells for I2C device addresses.
+
+Optional properties:
+- Child nodes conforming to I2C bus binding
+
+Examples:
+i2c0: i2c@7050 {
+   compatible = "sprd,sc9860-i2c";
+   reg = <0 0x7050 0 0x1000>;
+   interrupts = ;
+   clock-names = "i2c", "source", "enable";
+   clocks = <_i2c3>, <_26m>, <_ap_apb_gates 11>;
+   clock-frequency = <40>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+};
+
-- 
1.7.9.5

[PATCH v5 2/2] i2c: Add Spreadtrum I2C controller driver

2017-08-28 Thread Baolin Wang

This patch adds the I2C controller driver for Spreadtrum SC9860 platform.

Signed-off-by: Baolin Wang 
Reviewed-by: Andy Shevchenko 
---
Changes since v4:
 - Remove dump registers function.
 - Change 'unsigned int' to 'u32' type.
 - Invert ack logic to make it clear.
 - Modify some comments and error message.

Changes since v3:
 - Use SPDX-License-Identifier tag instead.

Changes since v2:
 - Remove some redundant comments and parens.
 - Define macros instead of magic number.
 - Add some comments to explain clock formula.
 - Change of_clk_get_by_name() to devm_clk_get().
 - Deal with other frequency.
 - Change register definition to lower case.
 - Change is_last_msg to boolean.
 - Other optimization.

Changes since v1:
 - Power on I2C device in probe().
 - Remove redundant macros and use __maybe_unused.
 - Remove redundant 'of_match_ptr'.
 - Modify return values and check the return value for 'clk_prepare_enable'.
---
 drivers/i2c/busses/Kconfig|7 +
 drivers/i2c/busses/Makefile   |1 +
 drivers/i2c/busses/i2c-sprd.c |  646 +
 3 files changed, 654 insertions(+)
 create mode 100644 drivers/i2c/busses/i2c-sprd.c

diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
index 1006b23..64729ac 100644
--- a/drivers/i2c/busses/Kconfig
+++ b/drivers/i2c/busses/Kconfig
@@ -900,6 +900,13 @@ config I2C_SIRF
  This driver can also be built as a module.  If so, the module
  will be called i2c-sirf.
 
+config I2C_SPRD
+   bool "Spreadtrum I2C interface"
+   depends on ARCH_SPRD
+   help
+ If you say yes to this option, support will be included for the
+ Spreadtrum I2C interface.
+
 config I2C_ST
tristate "STMicroelectronics SSC I2C support"
depends on ARCH_STI
diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
index 1b2fc81..505f74a 100644
--- a/drivers/i2c/busses/Makefile
+++ b/drivers/i2c/busses/Makefile
@@ -89,6 +89,7 @@ obj-$(CONFIG_I2C_SH7760)  += i2c-sh7760.o
 obj-$(CONFIG_I2C_SH_MOBILE)+= i2c-sh_mobile.o
 obj-$(CONFIG_I2C_SIMTEC)   += i2c-simtec.o
 obj-$(CONFIG_I2C_SIRF) += i2c-sirf.o
+obj-$(CONFIG_I2C_SPRD) += i2c-sprd.o
 obj-$(CONFIG_I2C_ST)   += i2c-st.o
 obj-$(CONFIG_I2C_STM32F4)  += i2c-stm32f4.o
 obj-$(CONFIG_I2C_STU300)   += i2c-stu300.o
diff --git a/drivers/i2c/busses/i2c-sprd.c b/drivers/i2c/busses/i2c-sprd.c
new file mode 100644
index 000..22e08ae
--- /dev/null
+++ b/drivers/i2c/busses/i2c-sprd.c
@@ -0,0 +1,646 @@
+/*
+ * Copyright (C) 2017 Spreadtrum Communications Inc.
+ *
+ * SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define I2C_CTL             0x00
+#define I2C_ADDR_CFG        0x04
+#define I2C_COUNT           0x08
+#define I2C_RX              0x0c
+#define I2C_TX              0x10
+#define I2C_STATUS          0x14
+#define I2C_HSMODE_CFG      0x18
+#define I2C_VERSION         0x1c
+#define ADDR_DVD0           0x20
+#define ADDR_DVD1           0x24
+#define ADDR_STA0_DVD       0x28
+#define ADDR_RST            0x2c
+
+/* I2C_CTL */
+#define STP_EN              BIT(20)
+#define FIFO_AF_LVL_MASK    GENMASK(19, 16)
+#define FIFO_AF_LVL         16
+#define FIFO_AE_LVL_MASK    GENMASK(15, 12)
+#define FIFO_AE_LVL         12
+#define I2C_DMA_EN          BIT(11)
+#define FULL_INTEN          BIT(10)
+#define EMPTY_INTEN         BIT(9)
+#define I2C_DVD_OPT         BIT(8)
+#define I2C_OUT_OPT         BIT(7)
+#define I2C_TRIM_OPT        BIT(6)
+#define I2C_HS_MODE         BIT(4)
+#define I2C_MODE            BIT(3)
+#define I2C_EN              BIT(2)
+#define I2C_INT_EN          BIT(1)
+#define I2C_START           BIT(0)
+
+/* I2C_STATUS */
+#define SDA_IN              BIT(21)
+#define SCL_IN              BIT(20)
+#define FIFO_FULL           BIT(4)
+#define FIFO_EMPTY          BIT(3)
+#define I2C_INT             BIT(2)
+#define I2C_RX_ACK          BIT(1)
+#define I2C_BUSY            BIT(0)
+
+/* ADDR_RST */
+#define I2C_RST             BIT(0)
+
+#define I2C_FIFO_DEEP       12
+#define I2C_FIFO_FULL_THLD  15
+#define I2C_FIFO_EMPTY_THLD 4
+#define I2C_DATA_STEP       8
+#define I2C_ADDR_DVD0_CALC(high, low)  \
+   ((((high) & GENMASK(15, 0)) << 16) | ((low) & GENMASK(15, 0)))
+#define I2C_ADDR_DVD1_CALC(high, low)  \
+   (((high) & GENMASK(31, 16)) | (((low) & GENMASK(31, 16)) >> 16))
+
+/* timeout (ms) for pm runtime autosuspend */
+#define SPRD_I2C_PM_TIMEOUT 1000
+
+/* SPRD i2c data structure */
+struct sprd_i2c {
+   struct i2c_adapter adap;
+   struct device *dev;
+   void __iomem *base;
+   struct i2c_msg *msg;
+   struct clk *clk;
+   u32 src_clk;
+   u32 bus_freq;
+   struct completion complete;
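
Note on I2C_ADDR_DVD0_CALC and I2C_ADDR_DVD1_CALC in the patch above: the two macros split a pair of 32-bit values across two registers. The first packs the low 16 bits of each value, the second packs the high 16 bits of each. The following is a small user-space sketch of that bit layout, assuming a local 32-bit stand-in for the kernel's GENMASK(); the GENMASK32 macro and the helper function names are hypothetical and exist only in this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Local 32-bit stand-in for the kernel's GENMASK(h, l); an assumption here. */
#define GENMASK32(h, l) \
	((uint32_t)((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l))))

/* Same shape as the macros in the patch, built on the stand-in mask. */
#define DVD0_CALC(high, low) \
	((((high) & GENMASK32(15, 0)) << 16) | ((low) & GENMASK32(15, 0)))
#define DVD1_CALC(high, low) \
	(((high) & GENMASK32(31, 16)) | (((low) & GENMASK32(31, 16)) >> 16))

/* Register value carrying the low halves: high[15:0] then low[15:0]. */
static uint32_t dvd0(uint32_t high, uint32_t low)
{
	return DVD0_CALC(high, low);
}

/* Register value carrying the high halves: high[31:16] then low[31:16]. */
static uint32_t dvd1(uint32_t high, uint32_t low)
{
	return DVD1_CALC(high, low);
}

/* Rebuild both inputs from the two register values and verify them. */
static void check_round_trip(uint32_t high, uint32_t low)
{
	uint32_t d0 = dvd0(high, low);
	uint32_t d1 = dvd1(high, low);

	assert(((d1 & GENMASK32(31, 16)) | (d0 >> 16)) == high);
	assert((((d1 & GENMASK32(15, 0)) << 16) | (d0 & GENMASK32(15, 0))) == low);
}
```

For example, with high = 0x12345678 and low = 0x9ABCDEF0, dvd0() yields 0x5678DEF0 and dvd1() yields 0x12349ABC, so no bits of either input are lost across the two registers.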

Re: [PATCH] libnvdimm: clean up command definitions

2017-08-28 Thread Dan Williams

On Mon, Aug 28, 2017 at 6:03 PM, Yasunori Goto  wrote:
>> On Mon, Aug 28, 2017 at 1:50 PM, Jerry Hoemann  wrote:
>> >
>> > On Mon, Aug 28, 2017 at 08:45:32AM -0700, Dan Williams wrote:
>> >> Remove the command payloads that do not have an associated libnvdimm
>> >> ioctl. I.e. remove the payloads that would only ever be carried in the
>> >> ND_CMD_CALL envelope. This prevents userspace from growing unnecessary
>> >> dependencies on this kernel header when userspace already has everything
>> >> it needs to craft and send these commands.
>> >
>> > Userspace needs to include linux/ndctl.h to make the call as
>> > that is where nd_cmd_pkg is defined.
>> >
>> > So you want to have some structures defined in ndctl.h and other
>> > defined in the to be created libndctl-nfit.h?  Plus a third header
>> > file for the HPE non-root calls?
>>
>> Yes. ndctl.h exports the ioctl command payloads, everything that goes
>> inside of ND_CMD_CALL is defined by userspace headers. The
>> libndctl-nfit.h header is proposed as a place to land vendor agnostic
>> NFIT-defined payloads, and any vendor specific definitions would
>> remain internal to libndctl as they are today.
>>
>> > Will libndctl-nfit.h be generally available and installed?
>>
>> Yes, that's the plan.
>>
>> > Will it be clean so that other applications can use it to get these
>> > definitions?  Or will it be loaded w/ a bunch of stuff only useful
>> > to your ndctl command?
>>
>> Yes, that's the plan. It's a bug if libndctl-nfit.h is not generically
>> clean for issuing the NFIT root device commands via some ND_CMD_CALL
>> helpers from the base libndctl library.
>>
>> In other words libndctl-nfit.h defines the payload and libndctl
>> defines some general helpers for issuing commands.
>
> Maybe I don't understand your idea yet, let me confirm it.
>
> Certainly, current acpi driver does not need these definitions.
> But, I think nfit_test.ko will need them to emulate these features.
>
> Do you intend that libndctl-nfit.h should be defined at "include/uapi/linux/"
> directory?
> Otherwise, it should be defined at "tools/testing/nvdimm/" or
> "tools/testing/nvdimm/test" ?

nfit_test will need its own internal / private copy of these payloads
in tools/testing/nvdimm/test so it can emulate how the bios behaves.
The include/uapi/linux directory is for user to kernel interface
definitions and these command payloads are purely an interface to bios
/ firmware.
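
To make the header split discussed above more concrete, here is a rough user-space sketch of the "envelope plus payload" idea: a generic header that a kernel UAPI header would export, followed by a payload whose layout is defined only by userspace. Everything named toy_* is hypothetical; the envelope is a simplified stand-in and not the real struct nd_cmd_pkg, and the payload structs do not correspond to any actual NFIT command.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified envelope header (NOT the real struct nd_cmd_pkg). */
struct toy_cmd_envelope {
	uint64_t family;         /* which command family the payload belongs to */
	uint64_t command;        /* command number within that family */
	uint32_t size_in;        /* bytes of input at the start of payload[] */
	uint32_t size_out;       /* bytes reserved for output after the input */
	unsigned char payload[]; /* opaque to the envelope owner */
};

/* Hypothetical payload layouts that would live in a userspace-only header. */
struct toy_payload_in {
	uint32_t handle;
	uint32_t flags;
};

struct toy_payload_out {
	uint32_t status;
};

/*
 * Allocate a zeroed envelope with room for input and output, and copy the
 * input payload in. Returns NULL on allocation failure; the caller frees it.
 */
static struct toy_cmd_envelope *toy_envelope_alloc(uint64_t family,
						   uint64_t command,
						   const void *in,
						   uint32_t size_in,
						   uint32_t size_out)
{
	struct toy_cmd_envelope *env;

	env = calloc(1, sizeof(*env) + (size_t)size_in + size_out);
	if (!env)
		return NULL;

	env->family = family;
	env->command = command;
	env->size_in = size_in;
	env->size_out = size_out;
	if (size_in)
		memcpy(env->payload, in, size_in);
	return env;
}
```

The point of the sketch is only that the code building the envelope never needs to know the payload layout, which is why the payload definitions can stay out of the kernel header.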

Re: [PATCH v15 4/5] mm: support reporting free page blocks

2017-08-28 Thread Wei Wang


On 08/28/2017 09:33 PM, Michal Hocko wrote:
> On Mon 28-08-17 18:08:32, Wei Wang wrote:
>> This patch adds support to walk through the free page blocks in the
>> system and report them via a callback function. Some page blocks may
>> leave the free list after zone->lock is released, so it is the caller's
>> responsibility to either detect or prevent the use of such pages.
>>
>> One example use of this patch is to accelerate live migration by skipping
>> the transfer of free pages reported from the guest. A popular method used
>> by the hypervisor to track which part of memory is written during live
>> migration is to write-protect all the guest memory. So, those pages that
>> are reported as free pages but are written after the report function
>> returns will be captured by the hypervisor, and they will be added to the
>> next round of memory transfer.
>
> OK, looks much better. I still have a few nits.
>
>> +extern void walk_free_mem_block(void *opaque,
>> +				int min_order,
>> +				bool (*report_page_block)(void *, unsigned long,
>> +							  unsigned long));
>> +
>
> please add names to arguments of the prototype
>
>>   /*
>>    * Free reserved pages within range [PAGE_ALIGN(start), end & PAGE_MASK)
>>    * into the buddy system. The freed pages will be poisoned with pattern
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 6d00f74..81eedc7 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -4762,6 +4762,71 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
>>  	show_swap_cache_info();
>>   }
>>   
>> +/**
>> + * walk_free_mem_block - Walk through the free page blocks in the system
>> + * @opaque: the context passed from the caller
>> + * @min_order: the minimum order of free lists to check
>> + * @report_page_block: the callback function to report free page blocks
>
> page_block has meaning in the core MM which doesn't strictly match its
> usage here. Moreover we are reporting pfn ranges rather than struct page
> range. So report_pfn_range would suit better.
>
> [...]
>
>> +	for_each_populated_zone(zone) {
>> +		for (order = MAX_ORDER - 1; order >= min_order; order--) {
>> +			for (mt = 0; !stop && mt < MIGRATE_TYPES; mt++) {
>> +				spin_lock_irqsave(&zone->lock, flags);
>> +				list = &zone->free_area[order].free_list[mt];
>> +				list_for_each_entry(page, list, lru) {
>> +					pfn = page_to_pfn(page);
>> +					stop = report_page_block(opaque, pfn,
>> +								 1 << order);
>> +					if (stop)
>> +						break;
>
> 	if (stop) {
> 		spin_unlock_irqrestore(&zone->lock, flags);
> 		return;
> 	}
>
> would be both easier and less error prone. E.g. you wouldn't pointlessly
> iterate over remaining orders just to realize there is nothing to be
> done for those...



Yes, that's better, thanks. I will take other suggestions as well.

Best,
Wei
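
For readers following the review, the two suggestions above (named prototype arguments, and unlocking and returning as soon as the callback asks to stop) can be sketched in plain user-space C. This is only a toy model under stated assumptions: struct toy_zone, its lock_depth counter standing in for zone->lock, and every toy_* name are hypothetical and are not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ORDER  4
#define TOY_MAX_BLOCKS 4

/* Toy free-list model: a few free block start pfns per order. */
struct toy_zone {
	int lock_depth; /* stand-in for zone->lock; 0 means unlocked */
	size_t nr_free[TOY_MAX_ORDER];
	unsigned long free_pfn[TOY_MAX_ORDER][TOY_MAX_BLOCKS];
};

static void toy_lock(struct toy_zone *zone)   { zone->lock_depth++; }
static void toy_unlock(struct toy_zone *zone) { zone->lock_depth--; }

/*
 * Walk free blocks from the highest order down to min_order. The callback
 * prototype names its arguments, and the walk drops the lock and returns
 * immediately when the callback returns true.
 */
static void toy_walk_free_blocks(struct toy_zone *zone, void *opaque,
				 int min_order,
				 bool (*report_pfn_range)(void *opaque,
							  unsigned long pfn,
							  unsigned long nr_pages))
{
	int order;
	size_t i;

	if (min_order < 0)
		min_order = 0;

	for (order = TOY_MAX_ORDER - 1; order >= min_order; order--) {
		toy_lock(zone);
		for (i = 0; i < zone->nr_free[order]; i++) {
			if (report_pfn_range(opaque, zone->free_pfn[order][i],
					     1UL << order)) {
				toy_unlock(zone);
				return;
			}
		}
		toy_unlock(zone);
	}
}

/* Example callback context: stop after a fixed number of reports. */
struct toy_report_ctx {
	unsigned long calls;
	unsigned long pages;
	unsigned long limit;
};

static bool toy_report_until_limit(void *opaque, unsigned long pfn,
				   unsigned long nr_pages)
{
	struct toy_report_ctx *ctx = opaque;

	(void)pfn;
	ctx->calls++;
	ctx->pages += nr_pages;
	return ctx->calls >= ctx->limit;
}
```

Because the early return sits next to the unlock, there is no stop flag to thread through the outer loops, which is the simplification the review asks for.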

Re: [PATCH] initramfs: Fix disabling of initramfs (and its compression)

2017-08-28 Thread Nicholas Piggin

On Mon, 28 Aug 2017 13:03:31 -0700
Florian Fainelli  wrote:

> On 05/21/2017 07:46 PM, Nicholas Piggin wrote:
> > On Sat, 20 May 2017 20:33:35 -0700
> > Florian Fainelli  wrote:
> >   
> >> Commit db2aa7fd15e8 ("initramfs: allow again choice of the embedded
> >> initram compression algorithm") introduced the possibility to select the
> >> initramfs compression algorithm from Kconfig and while this is a nice
> >> feature it broke the use case described below.
> >>
> >> Here is what my build system does:
> >>
> >> - kernel is initially configured not to have an initramfs included
> >> - build the user space root file system
> >> - re-configure the kernel to have an initramfs included
> >> (CONFIG_INITRAMFS_SOURCE="/path/to/romfs") and set relevant
> >> CONFIG_INITRAMFS options, in my case, no compression option
> >> (CONFIG_INITRAMFS_COMPRESSION_NONE)
> >> - kernel is re-built with these options -> kernel+initramfs image is
> >>   copied
> >> - kernel is re-built again without these options -> kernel image is
> >>   copied
> >>
> >> Building a kernel without an initramfs means setting this option:
> >>
> >> CONFIG_INITRAMFS_SOURCE="" (and this one only)
> >>
> >> whereas building a kernel with an initramfs means setting these options:
> >>
> >> CONFIG_INITRAMFS_SOURCE="/home/fainelli/work/uclinux-rootfs/romfs
> >> /home/fainelli/work/uclinux-rootfs/misc/initramfs.dev"
> >> CONFIG_INITRAMFS_ROOT_UID=1000
> >> CONFIG_INITRAMFS_ROOT_GID=1000
> >> CONFIG_INITRAMFS_COMPRESSION_NONE=y
> >> CONFIG_INITRAMFS_COMPRESSION=""
> >>
> >> Commit db2aa7fd15e857891cefbada8348c8d938c7a2bc ("initramfs: allow again
> >> choice of the embedded initram compression algorithm") is problematic
> >> because CONFIG_INITRAMFS_COMPRESSION which is used to determine the
> >> initramfs_data.cpio extension/compression is a string, and due to how
> >> Kconfig works it will evaluate in order, how to assign it.
> >>
> >> Setting CONFIG_INITRAMFS_COMPRESSION_NONE with
> >> CONFIG_INITRAMFS_SOURCE="" cannot possibly work (because of the depends
> >> on INITRAMFS_SOURCE!="" imposed on CONFIG_INITRAMFS_COMPRESSION ) yet we
> >> still get CONFIG_INITRAMFS_COMPRESSION assigned to ".gz" because
> >> CONFIG_RD_GZIP=y is set in my kernel, even when there is no initramfs
> >> being built.
> >>
> >> So we basically end up generating two initramfs_data.cpio* files, one
> >> without extension, and one with .gz. This causes usr/Makefile to track
> >> usr/initramfs_data.cpio.gz, and not usr/initramfs_data.cpio anymore,
> >> that is also largely problematic after
> >> 9e3596b0c6539e28546ff7c72a06576627068353 ("kbuild: initramfs cleanup,
> >> set target from Kconfig") because we used to track all possible
> >> initramfs_data files in the $(targets) variable before that commit.
> >>
> >> The end result is that the kernel with an initramfs clearly does not
> >> contain what we expect it to, it has a stale initramfs_data.cpio file
> >> built into it, and we keep re-generating an initramfs_data.cpio.gz file
> >> which is not the one that we want to include in the kernel image proper.
> >>
> >> The fix consists in hiding CONFIG_INITRAMFS_COMPRESSION when
> >> CONFIG_INITRAMFS_SOURCE="". This puts us back in a state to the pre-4.10
> >> behavior where we can properly disable and re-enable initramfs within
> >> the same kernel .config file, and be in control of what
> >> CONFIG_INITRAMFS_COMPRESSION is set to.
> >>
> >> Fixes: db2aa7fd15e8 ("initramfs: allow again choice of the embedded 
> >> initram compression algorithm")
> >> Fixes: 9e3596b0c653 ("kbuild: initramfs cleanup, set target from Kconfig")
> >> Signed-off-by: Florian Fainelli   
> > 
> > This is very thorough, thank you for tracking it down and fixing it.
> > 
> > I can't say I've worked through the problem in the code, but your
> > changelog and the proposed fix seem reasonable to me. So for what
> > it's worth:
> > 
> > Acked-by: Nicholas Piggin   
> 
> Well, I am looking at this again, and it's still broken, the same test
> case is involved, except this time, I am switching between no-initramfs
> and initramfs with gzip compression (the key thing is using a
> compression of some sort). The end result is the following:
> 
> - change stuff in the rootfs
> - build the kernel with initramfs, CONFIG_INITRAMFS_COMPRESSION_GZIP=y,
> usr/initramfs_data.cpio.gz gets generated correctly the first time
> - build the kernel without initramfs,
> CONFIG_INITRAMFS_COMPRESSION_NONE=y, usr/initramfs_data.cpio gets generated
> 
> Now back to step 1 add some files, and we can see that
> usr/initramfs_data.cpio.gz is now stale from before...
> 
> So while my earlier fix switched the initramfs w/o compression to no
> initramfs rebuild, now this does not work because we still have two
> files left to be tracked:
> 
> usr/initramfs_data.cpio (no compression, or when initramfs is disabled)
> and usr/initramfs_data.cpio.$(suffix_y)
> 
> How would you go about solving this?

I don't see the problem. When I change back to

Re: [PATCH v15 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG

2017-08-28 Thread Wei Wang


On 08/29/2017 02:03 AM, Michael S. Tsirkin wrote:
> On Mon, Aug 28, 2017 at 06:08:31PM +0800, Wei Wang wrote:
>> Add a new feature, VIRTIO_BALLOON_F_SG, which enables the transfer
>> of balloon (i.e. inflated/deflated) pages using scatter-gather lists
>> to the host.
>>
>> The implementation of the previous virtio-balloon is not very
>> efficient, because the balloon pages are transferred to the
>> host one by one. Here is the breakdown of the time in percentage
>> spent on each step of the balloon inflating process (inflating
>> 7GB of an 8GB idle guest).
>>
>> 1) allocating pages (6.5%)
>> 2) sending PFNs to host (68.3%)
>> 3) address translation (6.1%)
>> 4) madvise (19%)
>>
>> It takes about 4126ms for the inflating process to complete.
>> The above profiling shows that the bottlenecks are stage 2)
>> and stage 4).
>>
>> This patch optimizes step 2) by transferring pages to the host in
>> sgs. An sg describes a chunk of guest physically contiguous pages.
>> With this mechanism, step 4) can also be optimized by doing address
>> translation and madvise() in chunks rather than page by page.
>>
>> With this new feature, the above ballooning process takes ~597ms
>> resulting in an improvement of ~86%.
>>
>> TODO: optimize stage 1) by allocating/freeing a chunk of pages
>> instead of a single page each time.
>>
>> Signed-off-by: Wei Wang
>> Signed-off-by: Liang Li
>> Suggested-by: Michael S. Tsirkin
>> ---
>>   drivers/virtio/virtio_balloon.c     | 171
>>   include/uapi/linux/virtio_balloon.h |   1 +
>>   2 files changed, 155 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
>> index f0b3a0b..8ecc1d4 100644
>> --- a/drivers/virtio/virtio_balloon.c
>> +++ b/drivers/virtio/virtio_balloon.c
>> @@ -32,6 +32,8 @@
>>   #include 
>>   #include 
>>   #include 
>> +#include 
>> +#include 
>>   
>>   /*
>>    * Balloon device works in 4K page units.  So each page is pointed to by
>> @@ -79,6 +81,9 @@ struct virtio_balloon {
>>  	/* Synchronize access/update to this struct virtio_balloon elements */
>>  	struct mutex balloon_lock;
>>   
>> +	/* The xbitmap used to record balloon pages */
>> +	struct xb page_xb;
>> +
>>  	/* The array of pfns we tell the Host about. */
>>  	unsigned int num_pfns;
>>  	__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
>> @@ -141,13 +146,111 @@ static void set_page_pfns(struct virtio_balloon *vb,
>>  					  page_to_balloon_pfn(page) + i);
>>   }
>>   
>> +static int add_one_sg(struct virtqueue *vq, void *addr, uint32_t size)
>> +{
>> +	struct scatterlist sg;
>> +
>> +	sg_init_one(&sg, addr, size);
>> +	return virtqueue_add_inbuf(vq, &sg, 1, vq, GFP_KERNEL);
>> +}
>> +
>> +static void send_balloon_page_sg(struct virtio_balloon *vb,
>> +				 struct virtqueue *vq,
>> +				 void *addr,
>> +				 uint32_t size,
>> +				 bool batch)
>> +{
>> +	unsigned int len;
>> +	int err;
>> +
>> +	err = add_one_sg(vq, addr, size);
>> +	/* Sanity check: this can't really happen */
>> +	WARN_ON(err);
>
> It might be cleaner to detect that add failed due to
> ring full and kick then. Just an idea, up to you
> whether to do it.
>
>> +
>> +	/* If batching is in use, we batch the sgs till the vq is full. */
>> +	if (!batch || !vq->num_free) {
>> +		virtqueue_kick(vq);
>> +		wait_event(vb->acked, virtqueue_get_buf(vq, &len));
>> +		/* Release all the entries if there are */
>
> Meaning
> 	Account for all used entries if any
> ?
>
>> +		while (virtqueue_get_buf(vq, &len))
>> +			;
>
> Above code is reused below. Add a function?
>
>> +	}
>> +}
>> +
>> +/*
>> + * Send balloon pages in sgs to host. The balloon pages are recorded in the
>> + * page xbitmap. Each bit in the bitmap corresponds to a page of PAGE_SIZE.
>> + * The page xbitmap is searched for contiguous "1" bits, which correspond
>> + * to contiguous pages, to chunk into sgs.
>> + *
>> + * @page_xb_start and @page_xb_end form the range of bits in the xbitmap that
>> + * need to be searched.
>> + */
>> +static void tell_host_sgs(struct virtio_balloon *vb,
>> +			  struct virtqueue *vq,
>> +			  unsigned long page_xb_start,
>> +			  unsigned long page_xb_end)
>> +{
>> +	unsigned long sg_pfn_start, sg_pfn_end;
>> +	void *sg_addr;
>> +	uint32_t sg_len, sg_max_len = round_down(UINT_MAX, PAGE_SIZE);
>> +
>> +	sg_pfn_start = page_xb_start;
>> +	while (sg_pfn_start < page_xb_end) {
>> +		sg_pfn_start = xb_find_next_bit(&vb->page_xb, sg_pfn_start,
>> +						page_xb_end, 1);
>> +		if (sg_pfn_start == page_xb_end + 1)
>> +			break;
>> +		sg_pfn_end = xb_find_next_bit(&vb->page_xb, sg_pfn_start + 1,
>> +					      page_xb_end, 0);
>> +		sg_addr = (void *)pfn_to_kaddr(sg_pfn_start);
>> +		sg_len = (sg_pfn_end - sg_pfn_start) << PAGE_SHIFT;
>> +		while (sg_len > sg_max_len) {
>> +
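
The quoted patch is cut off at this point in the archive. As a rough illustration of the technique its comment describes (scan a bitmap for runs of set bits, then split each run into chunks no larger than a maximum length), here is a small user-space sketch. It uses a plain unsigned long array instead of the xbitmap, treats the end of the range as exclusive, and counts lengths in pages rather than bytes; those choices and all toy_* names are assumptions made for the example, not the patch's actual code.

```c
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

struct toy_run {
	unsigned long start_pfn;
	unsigned long nr_pages;
};

static int toy_test_bit(const unsigned long *map, unsigned long bit)
{
	return (int)((map[bit / TOY_BITS_PER_WORD] >>
		      (bit % TOY_BITS_PER_WORD)) & 1UL);
}

/* First bit in [start, end) whose value equals want, or end if none. */
static unsigned long toy_find_next(const unsigned long *map,
				   unsigned long start, unsigned long end,
				   int want)
{
	while (start < end && toy_test_bit(map, start) != want)
		start++;
	return start;
}

/*
 * Collect runs of set bits in [start, end), splitting any run longer than
 * max_pages. Returns the number of entries written to out (at most out_cap).
 */
static size_t toy_collect_runs(const unsigned long *map,
			       unsigned long start, unsigned long end,
			       unsigned long max_pages,
			       struct toy_run *out, size_t out_cap)
{
	size_t n = 0;

	if (max_pages == 0)
		return 0;

	while (start < end) {
		unsigned long run_start = toy_find_next(map, start, end, 1);
		unsigned long run_end;

		if (run_start == end)
			break;
		run_end = toy_find_next(map, run_start + 1, end, 0);

		while (run_start < run_end) {
			unsigned long len = run_end - run_start;

			if (len > max_pages)
				len = max_pages;
			if (n == out_cap)
				return n;
			out[n].start_pfn = run_start;
			out[n].nr_pages = len;
			n++;
			run_start += len;
		}
		start = run_end; /* bit at run_end is clear, or run_end == end */
	}
	return n;
}
```

For a bitmap with bits 1, 2, 3 and 6 set and max_pages of 2, this produces three chunks: (pfn 1, 2 pages), (pfn 3, 1 page) and (pfn 6, 1 page).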

Re: [RFC PATCH v5 0/5] vfio-pci: Add support for mmapping MSI-X table

2017-08-28 Thread Alexey Kardashevskiy

On 21/08/17 12:47, Alexey Kardashevskiy wrote:
> Folks,
> 
> Ok, people did talk, exchanged ideas, lovely :) What happens now? Do I
> repost this or go back to PCI bus flags or something else? Thanks.


Anyone, any help? How do we proceed with this? Thanks.



> 
> 
> 
> On 14/08/17 19:45, Alexey Kardashevskiy wrote:
>> Folks,
>>
>> Is there anything to change besides those compiler errors and David's
>> comment in 5/5? Or the whole patchset is too bad? Thanks.
>>
>>
>>
>> On 07/08/17 17:25, Alexey Kardashevskiy wrote:
>>> This is a followup for "[PATCH kernel v4 0/6] vfio-pci: Add support for 
>>> mmapping MSI-X table"
>>> http://www.spinics.net/lists/kvm/msg152232.html
>>>
>>> This time it is using "caps" in IOMMU groups. The main question is if PCI
>>> bus flags or IOMMU domains are still better (and which one).
>>
>>>
>>>
>>>
>>> Here is some background:
>>>
>>> The current vfio-pci implementation disallows mmapping the page
>>> containing the MSI-X table, because otherwise users could write
>>> directly to the MSI-X table and generate incorrect MSIs.
>>>
>>> However, this causes a performance issue when there
>>> are critical device registers in the same page as the
>>> MSI-X table. We have to handle the MMIO access to these
>>> registers in QEMU emulation rather than in the guest.
>>>
>>> To solve this issue, this series allows exposing the MSI-X table
>>> to userspace when hardware enables the capability of interrupt
>>> remapping, which can ensure that a given PCI device can only
>>> shoot the MSIs assigned to it. And we introduce a new bus_flags
>>> PCI_BUS_FLAGS_MSI_REMAP to test this capability on PCI side
>>> for different archs.
>>>
>>>
>>> This is based on sha1
>>> 26c5cebfdb6c "Merge branch 'parisc-4.13-4' of 
>>> git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux"
>>>
>>> Please comment. Thanks.
>>>
>>> Changelog:
>>>
>>> v5:
>>> * redid the whole thing via so-called IOMMU group capabilities
>>>
>>> v4:
>>> * rebased on recent upstream
>>> * got all 6 patches from v2 (v3 was missing some)
>>>
>>>
>>>
>>>
>>> Alexey Kardashevskiy (5):
>>>   iommu: Add capabilities to a group
>>>   iommu: Set IOMMU_GROUP_CAP_ISOLATE_MSIX if MSI controller enables IRQ
>>> remapping
>>>   iommu/intel/amd: Set IOMMU_GROUP_CAP_ISOLATE_MSIX if IRQ remapping is
>>> enabled
>>>   powerpc/iommu: Set IOMMU_GROUP_CAP_ISOLATE_MSIX
>>>   vfio-pci: Allow to expose MSI-X table to userspace when safe
>>>
>>>  include/linux/iommu.h            | 20
>>>  include/linux/vfio.h             |  1 +
>>>  arch/powerpc/kernel/iommu.c      |  1 +
>>>  drivers/iommu/amd_iommu.c        |  3 +++
>>>  drivers/iommu/intel-iommu.c      |  3 +++
>>>  drivers/iommu/iommu.c            | 35 +++
>>>  drivers/vfio/pci/vfio_pci.c      | 20 +---
>>>  drivers/vfio/pci/vfio_pci_rdwr.c |  5 -
>>>  drivers/vfio/vfio.c              | 15 +++
>>>  9 files changed, 99 insertions(+), 4 deletions(-)
>>>
>>
>>
> 
> 


-- 
Alexey

Re: [PATCH v4 3/3] ARM: dts: exynos: Remove the display-timing and delay from rinato dts

2017-08-28 Thread Hoegeun Kwon


Hi Krzysztof,

The driver has been merged into exynos-drm-misc.
Could you please check this patch (3/3)?

Best regards,
Hoegeun

On 07/13/2017 11:20 AM, Hoegeun Kwon wrote:

The display-timing and delay are included in the panel driver, so they
should be removed from the dts.

Signed-off-by: Hoegeun Kwon 
---
  arch/arm/boot/dts/exynos3250-rinato.dts | 22 --
  1 file changed, 22 deletions(-)

diff --git a/arch/arm/boot/dts/exynos3250-rinato.dts b/arch/arm/boot/dts/exynos3250-rinato.dts
index 443e0c9..6b70c8d 100644
--- a/arch/arm/boot/dts/exynos3250-rinato.dts
+++ b/arch/arm/boot/dts/exynos3250-rinato.dts
@@ -242,28 +242,6 @@
vci-supply = <_reg>;
reset-gpios = < 1 GPIO_ACTIVE_LOW>;
te-gpios = < 6 GPIO_ACTIVE_HIGH>;
-   power-on-delay= <30>;
-   power-off-delay= <120>;
-   reset-delay = <5>;
-   init-delay = <100>;
-   flip-horizontal;
-   flip-vertical;
-   panel-width-mm = <29>;
-   panel-height-mm = <29>;
-
-   display-timings {
-   timing-0 {
-   clock-frequency = <460>;
-   hactive = <320>;
-   vactive = <320>;
-   hfront-porch = <1>;
-   hback-porch = <1>;
-   hsync-len = <1>;
-   vfront-porch = <150>;
-   vback-porch = <1>;
-   vsync-len = <2>;
-   };
-   };
  
  		port {

dsi_in: endpoint {

Re: linux-next: Signed-off-by missing for commit in the slave-dma tree

2017-08-28 Thread Vinod Koul

On Tue, Aug 29, 2017 at 09:10:56AM +1000, Stephen Rothwell wrote:
> Hi Vinod,
> 
> Commit
> 
>   966e5e01f420 ("dmaengine: altera: Use macros instead of structs to describe 
> the registers")
> 
> is missing a Signed-off-by from its committer.

Oops, missed fixing that up while running with -i :(.

Fixed now, thanks for pointing it out.

-- 
~Vinod

[PATCH] docs: highres: fix broken urls

2017-08-28 Thread stephen lu

Some URLs are invalid. I found alternative URLs.

Signed-off-by: stephen lu 
---
 Documentation/timers/highres.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/timers/highres.txt b/Documentation/timers/highres.txt
index e878997..9d88f67 100644
--- a/Documentation/timers/highres.txt
+++ b/Documentation/timers/highres.txt
@@ -4,10 +4,10 @@ High resolution timers and dynamic ticks design notes
 Further information can be found in the paper of the OLS 2006 talk "hrtimers
 and beyond". The paper is part of the OLS 2006 Proceedings Volume 1, which can
 be found on the OLS website:
-http://www.linuxsymposium.org/2006/linuxsymposium_procv1.pdf
+https://www.kernel.org/doc/ols/2006/ols2006v1-pages-333-346.pdf

 The slides to this talk are available from:
-http://tglx.de/projects/hrtimers/ols2006-hrtimers.pdf
+http://www.cs.columbia.edu/~nahum/w6998/papers/ols2006-hrtimers-slides.pdf

 The slides contain five figures (pages 2, 15, 18, 20, 22), which illustrate the
 changes in the time(r) related Linux subsystems. Figure #1 (p. 2) shows the

RE: [PATCH] vsock: only load vmci transport on VMware hypervisor by default

2017-08-28 Thread Dexuan Cui

> From: Dexuan Cui
> Sent: Tuesday, August 22, 2017 21:21
> > ...
> > ...
> > The only problem here would be the potential for a guest and a host app to
> > have a conflict wrt port numbers, even though they would be able to
> > operate fine, if restricted to their appropriate transport.
> >
> > Thanks,
> > Jorgen
> 
> Hi Jorgen, Stefan,
> Thank you for the detailed analysis!
> You have a much better understanding than me about the complex
> scenarios. Can you please work out a patch? :-)

Hi Jorgen, Stefan,
May I know your plan for this? 
 
> IMO Linux driver of Hyper-V sockets is the simplest case, as we only have
> the "to host" option (the host side driver of Hyper-V sockets runs on
> Windows kernel and I don't think the other hypervisors emulate
> the full Hyper-V VMBus 4.0, which is required to support Hyper-V sockets).
> 
> -- Dexuan

Thanks,
-- Dexuan

Re: module: Remove const attribute from alias for MODULE_DEVICE_TABLE

2017-08-28 Thread Stefan Agner

On 2017-08-28 10:41, Kees Cook wrote:
> On Mon, Aug 28, 2017 at 10:38 AM, Nick Desaulniers
>  wrote:
>> I think Kees' proposal is a better solution; rather than require all
>> usage of device table to remember to add const, have the macro add it
>> for all users.  Otherwise if you require callers to add it, they may
>> forget.
> 
> And with the coccinelle script, it should be easy to invert the logic
> and remove const from the callers...
> 

I tried to reproduce my findings again but was not successful :-( I must
have changed .config or something in between and drawn the wrong
conclusions...

So removing the const in the module.h alias actually did not change
anything... It did not help for drivers which forget to constify... I
think even the alias in module.h was actually illegal according to the C
standard:

(C89, 6.2.7p2) "All declarations that refer to the same object or
function shall have compatible type; otherwise the behavior is
undefined."

I guess it would still make sense to constify the structs for most of
the 620 drivers which do not have it const currently. I found that some
drivers actually change the table at runtime, e.g.
drivers/net/usb/pegasus.c, so we would have to exclude them.
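
For readers following the thread, here is a minimal userspace sketch of the
pattern under discussion: a table that is not declared const, plus a
const-qualified alias created by a macro. DEMO_DEVICE_TABLE, struct demo_id
and demo_ids are hypothetical names, and the macro only approximates the
shape of the module.h alias. It assumes GCC or Clang on an ELF target, since
it relies on __typeof__ and __attribute__((alias)).

```c
#include <assert.h>

struct demo_id {
	unsigned int vendor;
	unsigned int product;
};

/*
 * Hypothetical stand-in for the module.h alias: it declares a second,
 * const-qualified name for an already defined table.
 */
#define DEMO_DEVICE_TABLE(name)						\
	extern const __typeof__(name) demo_alias_##name			\
		__attribute__((unused, alias(#name)))

/* The "driver" did not constify its table, so it stays writable. */
struct demo_id demo_ids[] = {
	{ 0x1234, 0x0001 },
	{ 0, 0 },
};
DEMO_DEVICE_TABLE(demo_ids);

/* Same storage, same size; only the qualifiers of the two names differ. */
static_assert(sizeof(demo_alias_demo_ids) == sizeof(demo_ids),
	      "alias and table must have the same size");

static int demo_same_object(void)
{
	return (const void *)demo_alias_demo_ids == (const void *)demo_ids;
}
```

The two names end up with differently qualified types for one object, which
is the situation the quoted 6.2.7p2 wording is about. Constifying the table
itself makes both declarations agree.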

--
Stefan

linux-next: manual merge of the net-next tree with the net tree

2017-08-28 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  drivers/net/ethernet/marvell/mvpp2.c

between commit:

  4c2286826451 ("net: mvpp2: fix the mac address used when using PPv2.2")

from the net tree and commits:

  09f8397553a2 ("net: mvpp2: introduce per-port nrxqs/ntxqs variables")
  213f428f5056 ("net: mvpp2: add support for TX interrupts and RX queue distribution modes")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/net/ethernet/marvell/mvpp2.c
index 4d598ca8503a,fea9ae5b70ba..
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@@ -6504,7 -7248,9 +7248,9 @@@ static int mvpp2_port_probe(struct plat
struct resource *res;
const char *dt_mac_addr;
const char *mac_from;
 -  char hw_mac_addr[ETH_ALEN];
 +  char hw_mac_addr[ETH_ALEN] = {0};
+   unsigned int ntxqs, nrxqs;
+   bool has_tx_irqs;
u32 id;
int features;
int phy_mode;

Re: [PATCH v3 4/5] input: Add MediaTek PMIC keys support

2017-08-28 Thread Chen Zhong

On Mon, 2017-08-28 at 09:57 -0700, Dmitry Torokhov wrote:
> Hi Chen,
> 
> On Fri, Aug 25, 2017 at 02:32:32PM +0800, Chen Zhong wrote:
> > +static int mtk_pmic_key_setup(struct mtk_pmic_keys *keys,
> > +   struct pmic_keys_info *info)
> > +{
> > +   int ret;
> > +
> > +   info->keys = keys;
> > +
> > +   ret = regmap_update_bits(keys->regmap, info->regs->intsel_reg,
> > +info->regs->intsel_mask,
> > +info->regs->intsel_mask);
> > +   if (ret < 0)
> > +   return ret;
> > +
> > +   ret = devm_request_threaded_irq(keys->dev, info->irq, NULL,
> > +   mtk_pmic_keys_irq_handler_thread,
> > +   IRQF_ONESHOT | IRQF_TRIGGER_HIGH,
> > +   "mtk-pmic-keys", info);
> > +   if (ret) {
> > +   dev_err(keys->dev, "Failed to request IRQ: %d: %d\n",
> > +   info->irq, ret);
> > +   return ret;
> > +   }
> > +
> > +   if (info->wakeup)
> > +   irq_set_irq_wake(info->irq, 1);
> 
> Normally we do this in suspend() (and undo in resume()), and I believe
> the driver API is enable_irq_wake() and disable_irq_wake().
> 

Hi Dmitry,

I'll add suspend/resume callback functions and do this with
enable_irq_wake() and disable_irq_wake().

Thank you.
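
A rough sketch of the pairing agreed on above, written as a userspace mock
so it can be compiled on its own. enable_irq_wake() and disable_irq_wake()
are replaced here by counting stand-ins, and struct demo_key,
demo_keys_suspend() and demo_keys_resume() are hypothetical names; the real
driver's structures and PM callbacks may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace stand-ins for the kernel helpers of the same name. They only
 * count calls so that the enable/disable pairing can be checked.
 */
static int demo_wake_depth;

static int enable_irq_wake(unsigned int irq)
{
	(void)irq;
	demo_wake_depth++;
	return 0;
}

static int disable_irq_wake(unsigned int irq)
{
	(void)irq;
	/* An unbalanced disable would be a driver bug. */
	assert(demo_wake_depth > 0);
	demo_wake_depth--;
	return 0;
}

/* Hypothetical per-key state. */
struct demo_key {
	unsigned int irq;
	bool wakeup;
};

/* On suspend, arm wakeup only for keys marked as wakeup sources. */
static int demo_keys_suspend(struct demo_key *keys, size_t nkeys)
{
	size_t i;

	for (i = 0; i < nkeys; i++)
		if (keys[i].wakeup)
			enable_irq_wake(keys[i].irq);
	return 0;
}

/* On resume, undo exactly what suspend did. */
static int demo_keys_resume(struct demo_key *keys, size_t nkeys)
{
	size_t i;

	for (i = 0; i < nkeys; i++)
		if (keys[i].wakeup)
			disable_irq_wake(keys[i].irq);
	return 0;
}
```

Keeping both loops keyed on the same wakeup flag is what keeps the enable and
disable calls balanced across a suspend/resume cycle.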


> Thanks.
>

[PATCH v7 07/11] sparc64: optimized struct page zeroing

2017-08-28 Thread Pavel Tatashin

Add an optimized mm_zero_struct_page(), so that struct pages are zeroed without
calling memset(). We do eight to ten regular stores, depending on the size of
struct page. The compiler optimizes out the conditions of the switch() statement.
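
As a rough userspace illustration of the technique described above (not the
kernel macro itself), the sketch below zeroes a fixed-size structure with
individual 64-bit stores selected by a switch on sizeof. struct demo_page and
demo_zero_struct_page() are hypothetical names, and static_assert stands in
for BUILD_BUG_ON. Because sizeof is a compile-time constant, a compiler can
be expected to drop the cases that are never taken.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page: nine 64-bit words, 72 bytes. */
struct demo_page {
	uint64_t words[9];
};

static inline void demo_zero_struct_page(struct demo_page *pp)
{
	uint64_t *_pp = (void *)pp;

	/* The structure must be 64, 72, or 80 bytes for the switch below. */
	static_assert(sizeof(struct demo_page) % 8 == 0, "size not a multiple of 8");
	static_assert(sizeof(struct demo_page) >= 64, "structure too small");
	static_assert(sizeof(struct demo_page) <= 80, "structure too large");

	switch (sizeof(struct demo_page)) {
	case 80:
		_pp[9] = 0;
		/* fallthrough */
	case 72:
		_pp[8] = 0;
		/* fallthrough */
	default:
		_pp[7] = 0;
		_pp[6] = 0;
		_pp[5] = 0;
		_pp[4] = 0;
		_pp[3] = 0;
		_pp[2] = 0;
		_pp[1] = 0;
		_pp[0] = 0;
	}
}
```

With the 72-byte structure used here, only the stores to words 8 down to 0
are reachable, so nine stores remain.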

SPARC-M6 with 15T of memory, single thread performance:

                        BASE             FIX              OPTIMIZED_FIX
bootmem_init             28.440467985s     2.305674818s     2.305161615s
free_area_init_nodes    202.845901673s   225.343084508s   172.556506560s

Total                   231.286369658s   227.648759326s   174.861668175s

BASE:  current linux
FIX:   This patch series without "optimized struct page zeroing"
OPTIMIZED_FIX: This patch series including the current patch.

bootmem_init() is where memory for struct pages is zeroed during
allocation. Note, about two seconds in this function is a fixed time: it
does not increase as memory is increased.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 arch/sparc/include/asm/pgtable_64.h | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 6fbd931f0570..cee5cc7ccc51 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -230,6 +230,36 @@ extern unsigned long _PAGE_ALL_SZ_BITS;
 extern struct page *mem_map_zero;
 #define ZERO_PAGE(vaddr)   (mem_map_zero)
 
+/* This macro must be updated when the size of struct page grows above 80
+ * or reduces below 64.
+ * The idea is that the compiler optimizes out the switch() statement and
+ * leaves only clrx instructions.
+ */
+#define	mm_zero_struct_page(pp) do {					\
+	unsigned long *_pp = (void *)(pp);				\
+									\
+	/* Check that struct page is either 64, 72, or 80 bytes */	\
+	BUILD_BUG_ON(sizeof(struct page) & 7);				\
+	BUILD_BUG_ON(sizeof(struct page) < 64);				\
+	BUILD_BUG_ON(sizeof(struct page) > 80);				\
+									\
+	switch (sizeof(struct page)) {					\
+	case 80:							\
+		_pp[9] = 0;	/* fallthrough */			\
+	case 72:							\
+		_pp[8] = 0;	/* fallthrough */			\
+	default:							\
+		_pp[7] = 0;						\
+		_pp[6] = 0;						\
+		_pp[5] = 0;						\
+		_pp[4] = 0;						\
+		_pp[3] = 0;						\
+		_pp[2] = 0;						\
+		_pp[1] = 0;						\
+		_pp[0] = 0;						\
+	}								\
+} while (0)
+
 /* PFNs are real physical page numbers.  However, mem_map only begins to record
  * per-page information starting at pfn_base.  This is to handle systems where
  * the first physical page in the machine is at some huge physical address,
-- 
2.14.1

[PATCH v7 00/11] complete deferred page initialization

2017-08-28 Thread Pavel Tatashin

Changelog:
v7 - v6
- Addressed comments from Michal Hocko
- memblock_discard() patch was removed from this series and integrated
  separately
- Fixed a bug reported by the kbuild test robot; new patch:
  mm: zero reserved and unavailable struct pages
- Removed patch
  x86/mm: reserve only exiting low pages
  as it is no longer needed because of the previous fix
- Rewrote deferred_init_memmap(); found and fixed an existing bug where
  the page variable is not reset when zone holes are present.
- Merged several patches together per Michal's request
- Added performance data including raw logs

v6 - v5
- Fixed ARM64 + kasan code, as reported by Ard Biesheuvel
- Tested ARM64 code in qemu and found a few more issues, which I fixed in
  this iteration
- Added page roundup/rounddown to x86 and arm zeroing routines to zero the
  whole allocated range, instead of only the provided address range.
- Addressed SPARC related comment from Sam Ravnborg
- Fixed section mismatch warnings related to memblock_discard().

v5 - v4
- Fixed build issues reported by kbuild on various configurations

v4 - v3
- Rewrote code to zero struct pages in __init_single_page() as
  suggested by Michal Hocko
- Added code to handle issues related to accessing struct page
  memory before they are initialized.

v3 - v2
- Addressed David Miller's comments about one change per patch:
    * Split changes to platforms into 4 patches
    * Made "do not zero vmemmap_buf" a separate patch
v2 - v1
- Per request, added s390 to deferred "struct page" zeroing
- Collected performance data on x86 which proves the importance of
  keeping memset() as prefetch (see below).

SMP machines can benefit from the DEFERRED_STRUCT_PAGE_INIT config option,
which defers initializing struct pages until all cpus have been started so
it can be done in parallel.

However, this feature is sub-optimal, because the deferred page
initialization code expects that the struct pages have already been zeroed,
and the zeroing is done early in boot with a single thread only.  Also, we
access that memory and set flags before struct pages are initialized. All
of this is fixed in this patchset.

In this work we do the following:
- Never read a struct page until it has been initialized
- Never set any fields in struct pages before they are initialized
- Zero struct page at the beginning of struct page initialization
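
The three rules above can be sketched in userspace as follows. This is only
an illustration under stated assumptions: struct demo_page,
demo_init_single_page() and demo_init_range() are hypothetical names, and the
fields are placeholders rather than the real struct page layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much reduced page descriptor. */
struct demo_page {
	unsigned long flags;
	int refcount;
	int nid;
};

/*
 * Zero the descriptor as the first step of its initialization, then set
 * fields. Nothing reads or writes the descriptor before this point.
 */
static void demo_init_single_page(struct demo_page *page, int nid)
{
	memset(page, 0, sizeof(*page));
	page->refcount = 1;
	page->nid = nid;
}

/*
 * Initialize map[start..end). The backing memory is assumed to come from
 * an allocator that does not zero it, so it may hold garbage on entry.
 */
static void demo_init_range(struct demo_page *map, size_t start, size_t end,
			    int nid)
{
	size_t i;

	assert(start <= end);
	for (i = start; i < end; i++)
		demo_init_single_page(&map[i], nid);
}
```

Because zeroing happens per descriptor inside the init routine, separate
ranges can be initialized by separate threads without an earlier
single-threaded zeroing pass over the whole map.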


==
Performance improvements on x86 machine with 8 nodes:
Intel(R) Xeon(R) CPU E7-8895 v3 @ 2.60GHz and 1T of memory:
                        TIME            SPEED UP
base no deferred:       95.796233s
fix no deferred:        79.978956s      19.77%

base deferred:          77.254713s
fix deferred:           55.050509s      40.34%
==
SPARC M6 3600 MHz with 15T of memory
                        TIME            SPEED UP
base no deferred:       358.335727s
fix no deferred:        302.320936s     18.52%

base deferred:          237.534603s
fix deferred:           182.103003s     30.44%
==
Raw dmesg output with timestamps:
x86 base no deferred:    https://hastebin.com/ofunepurit.scala
x86 base deferred:       https://hastebin.com/ifazegeyas.scala
x86 fix no deferred:     https://hastebin.com/pegocohevo.scala
x86 fix deferred:        https://hastebin.com/ofupevikuk.scala
sparc base no deferred:  https://hastebin.com/ibobeteken.go
sparc base deferred:     https://hastebin.com/fariqimiyu.go
sparc fix no deferred:   https://hastebin.com/muhegoheyi.go
sparc fix deferred:      https://hastebin.com/xadinobutu.go
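
The SPEED UP figures above appear consistent with the ratio of base time to
fix time, minus one, expressed in percent. That definition is an assumption
inferred from the numbers, not something the posting states, and
demo_speedup_percent() is a hypothetical helper:

```c
#include <assert.h>

/* Speed-up in percent, assuming it is defined as (base / fix - 1) * 100. */
static double demo_speedup_percent(double base_seconds, double fix_seconds)
{
	assert(fix_seconds > 0.0);
	return (base_seconds / fix_seconds - 1.0) * 100.0;
}
```

For example, demo_speedup_percent(95.796233, 79.978956) lands close to the
19.77% listed for the x86 run without deferred initialization.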

Pavel Tatashin (11):
  x86/mm: setting fields in deferred pages
  sparc64/mm: setting fields in deferred pages
  mm: deferred_init_memmap improvements
  sparc64: simplify vmemmap_populate
  mm: defining memblock_virt_alloc_try_nid_raw
  mm: zero struct pages during initialization
  sparc64: optimized struct page zeroing
  mm: zero reserved and unavailable struct pages
  x86/kasan: explicitly zero kasan shadow memory
  arm64/kasan: explicitly zero kasan shadow memory
  mm: stop zeroing memory during allocation in vmemmap

 arch/arm64/mm/kasan_init.c  |  42 
 arch/sparc/include/asm/pgtable_64.h |  30 ++
 arch/sparc/mm/init_64.c |  31 +++---
 arch/x86/mm/init_64.c   |   9 +-
 arch/x86/mm/kasan_init_64.c |  66 
 include/linux/bootmem.h |  27 +
 include/linux/memblock.h|  16 +++
 include/linux/mm.h  |  26 +
 mm/memblock.c   |  60 +--
 mm/page_alloc.c | 207 
 mm/sparse-vmemmap.c |  14 +--
 mm/sparse.c |   6 +-
 12 files changed, 406 insertions(+), 128 deletions(-)

-- 
2.14.1

[PATCH v7 09/11] x86/kasan: explicitly zero kasan shadow memory

2017-08-28 Thread Pavel Tatashin

To optimize the performance of struct page initialization,
vmemmap_populate() will no longer zero memory.

We must explicitly zero the memory that is allocated by vmemmap_populate()
for kasan, as this memory does not go through struct page initialization
path.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 arch/x86/mm/kasan_init_64.c | 66 +
 1 file changed, 66 insertions(+)

diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 02c9d7553409..96fde5bf9597 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -84,6 +84,66 @@ static struct notifier_block kasan_die_notifier = {
 };
 #endif
 
+/*
+ * x86 variant of vmemmap_populate() uses either PMD_SIZE pages or base pages
+ * to map allocated memory.  This routine determines the page size for the given
+ * address from vmemmap.
+ */
+static u64 get_vmemmap_pgsz(u64 addr)
+{
+   pgd_t *pgd;
+   p4d_t *p4d;
+   pud_t *pud;
+   pmd_t *pmd;
+
+   pgd = pgd_offset_k(addr);
+   BUG_ON(pgd_none(*pgd) || pgd_large(*pgd));
+
+   p4d = p4d_offset(pgd, addr);
+   BUG_ON(p4d_none(*p4d) || p4d_large(*p4d));
+
+   pud = pud_offset(p4d, addr);
+   BUG_ON(pud_none(*pud) || pud_large(*pud));
+
+   pmd = pmd_offset(pud, addr);
+   BUG_ON(pmd_none(*pmd));
+
+   if (pmd_large(*pmd))
+   return PMD_SIZE;
+   return PAGE_SIZE;
+}
+
+/*
+ * Memory that was allocated by vmemmap_populate is not zeroed, so we must
+ * zero it here explicitly.
+ */
+static void
+zero_vmemmap_populated_memory(void)
+{
+   u64 i, start, end;
+
+   for (i = 0; i < E820_MAX_ENTRIES && pfn_mapped[i].end; i++) {
+   void *kaddr_start = pfn_to_kaddr(pfn_mapped[i].start);
+   void *kaddr_end = pfn_to_kaddr(pfn_mapped[i].end);
+
+   start = (u64)kasan_mem_to_shadow(kaddr_start);
+   end = (u64)kasan_mem_to_shadow(kaddr_end);
+
+   /* Round to the start and end of the mapped pages */
+   start = rounddown(start, get_vmemmap_pgsz(start));
+   end = roundup(end, get_vmemmap_pgsz(start));
+   memset((void *)start, 0, end - start);
+   }
+
+   start = (u64)kasan_mem_to_shadow(_stext);
+   end = (u64)kasan_mem_to_shadow(_end);
+
+   /* Round to the start and end of the mapped pages */
+   start = rounddown(start, get_vmemmap_pgsz(start));
+   end = roundup(end, get_vmemmap_pgsz(start));
+   memset((void *)start, 0, end - start);
+}
+
 void __init kasan_early_init(void)
 {
int i;
@@ -146,6 +206,12 @@ void __init kasan_init(void)
load_cr3(init_top_pgt);
__flush_tlb_all();
 
+   /*
+* vmemmap_populate does not zero the memory, so we need to zero it
+* explicitly
+*/
+   zero_vmemmap_populated_memory();
+
/*
 * kasan_zero_page has been used as early shadow memory, thus it may
 * contain some garbage. Now we can clear and write protect it, since
-- 
2.14.1
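
The rounding step in zero_vmemmap_populated_memory() can be illustrated in
userspace. The sketch below is not part of the patch: round_down_to(),
round_up_to() and zero_covering_blocks() are hypothetical names, offsets
are relative to a plain buffer instead of shadow addresses, and a single
block size is assumed for the whole range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round x down to a multiple of align (align > 0). */
static uint64_t round_down_to(uint64_t x, uint64_t align)
{
	return x - (x % align);
}

/* Round x up to a multiple of align (align > 0). */
static uint64_t round_up_to(uint64_t x, uint64_t align)
{
	return round_down_to(x + align - 1, align);
}

/*
 * Zero every whole block that overlaps [start, end) inside buf.
 * Returns the number of bytes zeroed.
 */
static size_t zero_covering_blocks(unsigned char *buf, size_t buf_len,
				   uint64_t start, uint64_t end,
				   uint64_t block)
{
	uint64_t lo, hi;

	assert(block > 0 && start <= end);
	lo = round_down_to(start, block);
	hi = round_up_to(end, block);
	assert(hi <= buf_len);

	memset(buf + lo, 0, hi - lo);
	return (size_t)(hi - lo);
}
```

With a block size of 16, the range [20, 35) widens to [16, 48), so 32 bytes
are cleared even though only 15 were asked for.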

[PATCH v7 04/11] sparc64: simplify vmemmap_populate

2017-08-28 Thread Pavel Tatashin

Remove duplicated code by using the common functions
vmemmap_pud_populate and vmemmap_pgd_populate.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 arch/sparc/mm/init_64.c | 23 ++-
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 12dbba85a2e2..a603d2c9087d 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2611,30 +2611,19 @@ int __meminit vmemmap_populate(unsigned long vstart, unsigned long vend,
vstart = vstart & PMD_MASK;
vend = ALIGN(vend, PMD_SIZE);
for (; vstart < vend; vstart += PMD_SIZE) {
-   pgd_t *pgd = pgd_offset_k(vstart);
+   pgd_t *pgd = vmemmap_pgd_populate(vstart, node);
unsigned long pte;
pud_t *pud;
pmd_t *pmd;
 
-   if (pgd_none(*pgd)) {
-   pud_t *new = vmemmap_alloc_block(PAGE_SIZE, node);
+   if (!pgd)
+   return -ENOMEM;
 
-   if (!new)
-   return -ENOMEM;
-   pgd_populate(_mm, pgd, new);
-   }
-
-   pud = pud_offset(pgd, vstart);
-   if (pud_none(*pud)) {
-   pmd_t *new = vmemmap_alloc_block(PAGE_SIZE, node);
-
-   if (!new)
-   return -ENOMEM;
-   pud_populate(_mm, pud, new);
-   }
+   pud = vmemmap_pud_populate(pgd, vstart, node);
+   if (!pud)
+   return -ENOMEM;
 
pmd = pmd_offset(pud, vstart);
-
pte = pmd_val(*pmd);
if (!(pte & _PAGE_VALID)) {
void *block = vmemmap_alloc_block(PMD_SIZE, node);
-- 
2.14.1
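
The simplification above swaps open-coded "check, allocate, install" steps
for helpers that return the next-level entry or NULL. A toy userspace
sketch of that pattern follows; struct table, table_populate() and
populate_two_levels() are hypothetical stand-ins and do not mirror the
kernel's page-table types or allocators.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_ENTRIES 4

/* Toy page-table level: each slot points to the next level or is NULL. */
struct table {
	struct table *next[TOY_ENTRIES];
};

/*
 * Return the child at idx, allocating a zeroed one if the slot is empty.
 * Returns NULL only if the allocation fails.
 */
static struct table *table_populate(struct table *parent, unsigned int idx)
{
	assert(idx < TOY_ENTRIES);
	if (!parent->next[idx])
		parent->next[idx] = calloc(1, sizeof(struct table));
	return parent->next[idx];
}

/* Caller side: one NULL check per level. Returns 0 or -1. */
static int populate_two_levels(struct table *root, unsigned int i,
			       unsigned int j)
{
	struct table *mid = table_populate(root, i);

	if (!mid)
		return -1;
	if (!table_populate(mid, j))
		return -1;
	return 0;
}
```

Calling populate_two_levels() twice with the same indices reuses the
entries created by the first call, which is the property the caller in the
loop relies on.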

[PATCH v7 08/11] mm: zero reserved and unavailable struct pages

2017-08-28 Thread Pavel Tatashin

Some memory is reserved but unavailable: not present in memblock.memory
(because not backed by physical pages), but present in memblock.reserved.
Such memory has backing struct pages, but they are not initialized by going
through __init_single_page().

In some cases these struct pages are accessed even if they do not contain
any data. For example, page_to_pfn() might access page->flags if this is
where section information is stored (CONFIG_SPARSEMEM,
SECTION_IN_PAGE_FLAGS).

Since struct pages are zeroed in __init_single_page(), and not at
allocation time, we must zero such struct pages explicitly.

The patch adds a new memblock iterator:
for_each_resv_unavail_range(i, p_start, p_end)

It iterates through the reserved && !memory ranges, and we zero the struct
pages in those ranges explicitly by calling mm_zero_struct_page().

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 include/linux/memblock.h | 16 
 include/linux/mm.h   |  6 ++
 mm/page_alloc.c  | 30 ++
 3 files changed, 52 insertions(+)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index bae11c7e7bf3..bdd4268f9323 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -237,6 +237,22 @@ unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
for_each_mem_range_rev(i, , , \
   nid, flags, p_start, p_end, p_nid)
 
+/**
+ * for_each_resv_unavail_range - iterate through reserved and unavailable memory
+ * @i: u64 used as loop variable
+ * @flags: pick from blocks based on memory attributes
+ * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
+ * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
+ *
+ * Walks over unavailable but reserved (reserved && !memory) areas of memblock.
+ * Available as soon as memblock is initialized.
+ * Note: because this memory does not belong to any physical node, flags and
+ * nid arguments do not make sense and are thus not exported as arguments.
+ */
+#define for_each_resv_unavail_range(i, p_start, p_end) \
+   for_each_mem_range(i, , , \
+  NUMA_NO_NODE, MEMBLOCK_NONE, p_start, p_end, NULL)
+
 static inline void memblock_set_region_flags(struct memblock_region *r,
 unsigned long flags)
 {
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 183ac5e733db..0a440ff8f226 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1968,6 +1968,12 @@ extern int __meminit __early_pfn_to_nid(unsigned long pfn,
struct mminit_pfnnid_cache *state);
 #endif
 
+#ifdef CONFIG_HAVE_MEMBLOCK
+void zero_resv_unavail(void);
+#else
+static inline void __paginginit zero_resv_unavail(void) {}
+#endif
+
 extern void set_dma_reserve(unsigned long new_dma_reserve);
 extern void memmap_init_zone(unsigned long, int, unsigned long,
unsigned long, enum memmap_context);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4d67fe3dd172..484c16fb5f0d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6261,6 +6261,34 @@ void __paginginit free_area_init_node(int nid, unsigned long *zones_size,
free_area_init_core(pgdat);
 }
 
+#ifdef CONFIG_HAVE_MEMBLOCK
+/*
+ * Only struct pages that are backed by physical memory are zeroed and
+ * initialized by going through __init_single_page(). But, there are some
+ * struct pages which are reserved in memblock allocator and their fields
+ * may be accessed (for example page_to_pfn() on some configuration accesses
+ * flags). We must explicitly zero those struct pages.
+ */
+void __paginginit zero_resv_unavail(void)
+{
+   phys_addr_t start, end;
+   unsigned long pfn;
+   u64 i, pgcnt;
+
+   /* Loop through ranges that are reserved, but do not have reported
+* physical memory backing.
+*/
+   pgcnt = 0;
+   for_each_resv_unavail_range(i, , ) {
+   for (pfn = PFN_DOWN(start); pfn < PFN_UP(end); pfn++) {
+   mm_zero_struct_page(pfn_to_page(pfn));
+   pgcnt++;
+   }
+   }
+   pr_info("Reserved but unavailable: %lld pages", pgcnt);
+}
+#endif /* CONFIG_HAVE_MEMBLOCK */
+
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 
 #if MAX_NUMNODES > 1
@@ -6684,6 +6712,7 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
node_set_state(nid, N_MEMORY);
check_for_memory(pgdat, nid);
}
+   zero_resv_unavail();
 }
 
 static int __init cmdline_parse_core(char *p, unsigned long *core)
@@ -6847,6 +6876,7 @@ void __init free_area_init(unsigned long *zones_size)
 {
free_area_init_node(0, zones_size,
__pa(PAGE_OFFSET) >> PAGE_SHIFT, NULL);
+   zero_resv_unavail();
 }
 
 static int
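
The "reserved && !memory" walk described in the commit message amounts to
subtracting one sorted range list from another. The sketch below shows
that idea in userspace; struct range and ranges_subtract() are
hypothetical, and the kernel's memblock iterator is implemented
differently. Both inputs are assumed to be sorted and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/*
 * Store in out[] the parts of a[] that are not covered by b[].
 * Returns the number of ranges written (never more than max_out).
 */
static size_t ranges_subtract(const struct range *a, size_t na,
			      const struct range *b, size_t nb,
			      struct range *out, size_t max_out)
{
	size_t i, j = 0, n = 0;

	for (i = 0; i < na; i++) {
		uint64_t cur = a[i].start;

		/* Skip b ranges that end before this a range begins. */
		while (j < nb && b[j].end <= cur)
			j++;

		/* Emit the gaps between b ranges that overlap a[i]. */
		for (size_t k = j; k < nb && b[k].start < a[i].end; k++) {
			if (b[k].start > cur && n < max_out)
				out[n++] = (struct range){ cur, b[k].start };
			if (b[k].end > cur)
				cur = b[k].end;
		}

		/* Emit whatever is left after the last overlapping b range. */
		if (cur < a[i].end && n < max_out)
			out[n++] = (struct range){ cur, a[i].end };
	}
	return n;
}
```

With reserved = [0,10), [20,30), [40,50) and memory = [5,25), [45,60), the
result is [0,5), [25,30), [40,45): the reserved pieces with no reported
memory behind them.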

[PATCH v7 10/11] arm64/kasan: explicitly zero kasan shadow memory

2017-08-28 Thread Pavel Tatashin

To optimize the performance of struct page initialization,
vmemmap_populate() will no longer zero memory.

We must explicitly zero the memory that is allocated by vmemmap_populate()
for kasan, as this memory does not go through struct page initialization
path.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 arch/arm64/mm/kasan_init.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index 81f03959a4ab..e78a9ecbb687 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -135,6 +135,41 @@ static void __init clear_pgds(unsigned long start,
set_pgd(pgd_offset_k(start), __pgd(0));
 }
 
+/*
+ * Memory that was allocated by vmemmap_populate is not zeroed, so we must
+ * zero it here explicitly.
+ */
+static void
+zero_vmemmap_populated_memory(void)
+{
+   struct memblock_region *reg;
+   u64 start, end;
+
+   for_each_memblock(memory, reg) {
+   start = __phys_to_virt(reg->base);
+   end = __phys_to_virt(reg->base + reg->size);
+
+   if (start >= end)
+   break;
+
+   start = (u64)kasan_mem_to_shadow((void *)start);
+   end = (u64)kasan_mem_to_shadow((void *)end);
+
+   /* Round to the start and end of the mapped pages */
+   start = round_down(start, SWAPPER_BLOCK_SIZE);
+   end = round_up(end, SWAPPER_BLOCK_SIZE);
+   memset((void *)start, 0, end - start);
+   }
+
+   start = (u64)kasan_mem_to_shadow(_text);
+   end = (u64)kasan_mem_to_shadow(_end);
+
+   /* Round to the start and end of the mapped pages */
+   start = round_down(start, SWAPPER_BLOCK_SIZE);
+   end = round_up(end, SWAPPER_BLOCK_SIZE);
+   memset((void *)start, 0, end - start);
+}
+
 void __init kasan_init(void)
 {
u64 kimg_shadow_start, kimg_shadow_end;
@@ -205,8 +240,15 @@ void __init kasan_init(void)
pfn_pte(sym_to_pfn(kasan_zero_page), PAGE_KERNEL_RO));
 
memset(kasan_zero_page, 0, PAGE_SIZE);
+
cpu_replace_ttbr1(lm_alias(swapper_pg_dir));
 
+   /*
+* vmemmap_populate does not zero the memory, so we need to zero it
+* explicitly
+*/
+   zero_vmemmap_populated_memory();
+
/* At this point kasan is fully initialized. Enable error messages */
init_task.kasan_depth = 0;
pr_info("KernelAddressSanitizer initialized\n");
-- 
2.14.1

[PATCH v7 02/11] sparc64/mm: setting fields in deferred pages

2017-08-28 Thread Pavel Tatashin

Without the deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in struct pages are never changed before the struct
pages are first initialized by going through __init_single_page().

With the deferred struct page feature enabled, there is a case where we set
some fields prior to initializing:

mem_init() {
 register_page_bootmem_info();
 free_all_bootmem();
 ...
}

When register_page_bootmem_info() is called, only non-deferred struct pages
have been initialized. But this function goes through some reserved pages
which might be part of the deferred range, and thus are not yet initialized.

mem_init
 register_page_bootmem_info
  register_page_bootmem_info_node
   get_page_bootmem
    .. setting fields here ..
    such as: page->freelist = (void *)type;

 free_all_bootmem()
  free_low_memory_core_early()
   for_each_reserved_mem_region()
    reserve_bootmem_region()
     init_reserved_page() <- Only if this is deferred reserved page
      __init_single_pfn()
       __init_single_page()
        memset(0) <-- Lose the set fields here

We end up with an issue similar to the one in the previous patch: currently
we do not observe a problem because the memory is zeroed, but if flag
asserts are changed we can start hitting issues.

Also, because in this patch series we will stop zeroing struct page memory
during allocation, we must make sure that struct pages are properly
initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to swap the order of the above calls.

Signed-off-by: Pavel Tatashin 
Reviewed-by: Steven Sistare 
Reviewed-by: Daniel Jordan 
Reviewed-by: Bob Picco 
---
 arch/sparc/mm/init_64.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index b3020a956b87..12dbba85a2e2 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2508,9 +2508,15 @@ void __init mem_init(void)
 {
high_memory = __va(last_valid_pfn << PAGE_SHIFT);
 
-   register_page_bootmem_info();
free_all_bootmem();
 
+   /* Must be done after boot memory is put on freelist, because here we
+* might set fields in deferred struct pages that have not yet been
+* initialized, and free_all_bootmem() initializes all the reserved
+* deferred pages for us.
+*/
+   register_page_bootmem_info();
+
/*
 * Set up the zero page, mark it reserved, so that page count
 * is not manipulated when freeing the page from user ptes.
-- 
2.14.1
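
The ordering problem described in the commit message can be reproduced
with a toy structure. This is a userspace sketch under assumed names
(struct toy_page, toy_init_page(), toy_register(), tag_survives()); it
only models "initialization zeroes the structure first", not the real
struct page.

```c
#include <assert.h>
#include <string.h>

struct toy_page {
	unsigned long flags;
	void *freelist;
};

/* Stand-in for page initialization: zero first, then set defaults. */
static void toy_init_page(struct toy_page *p)
{
	memset(p, 0, sizeof(*p));
	p->flags = 1UL;
}

/* Stand-in for early bookkeeping that stores a tag in the page. */
static void toy_register(struct toy_page *p, unsigned long type)
{
	p->freelist = (void *)type;
}

/* Returns 1 if the tag is still set after both steps, 0 if it was lost. */
static int tag_survives(int register_first)
{
	struct toy_page p;

	if (register_first) {
		toy_register(&p, 7UL);
		toy_init_page(&p);	/* the memset wipes the tag */
	} else {
		toy_init_page(&p);
		toy_register(&p, 7UL);	/* the tag is set on an initialized page */
	}
	return p.freelist == (void *)7UL;
}
```

tag_survives(1) models the old call order and returns 0; tag_survives(0)
models the swapped order from this patch and returns 1.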
