Re: [PATCH 3/3] nvme: complete request in work queue on CPU with flooded interrupts

2019-08-20 Thread Sagi Grimberg




> From: Long Li
>
> When an NVMe hardware queue is mapped to several CPU queues, it is possible
> that the CPU this hardware queue is bound to is flooded by returning I/O for
> other CPUs.
>
> For example, consider the following scenario:
> 1. CPUs 0, 1, 2 and 3 share the same hardware queue
> 2. the hardware queue interrupts CPU 0 for I/O response
> 3. processes from CPUs 1, 2 and 3 keep sending I/Os
>
> CPU 0 may be flooded with interrupts from the NVMe device that are I/O
> responses for CPUs 1, 2 and 3. Under heavy I/O load, it is possible that
> CPU 0 spends all its time serving NVMe and other system interrupts, and
> never gets a chance to run in process context.
>
> To fix this, CPU 0 can schedule a work item to complete the I/O request
> when it detects that the scheduler is not making progress. This serves
> multiple purposes:
>
> 1. This CPU has to be scheduled to complete the request. The other CPUs
> can't issue more I/Os until some previous I/Os are completed. This helps
> this CPU get out of NVMe interrupts.
>
> 2. This acts as a throttling mechanism for NVMe devices, in that the device
> cannot starve a CPU while servicing I/Os from other CPUs.
>
> 3. This CPU can make progress on RCU and other work items on its queue.


The problem is indeed real, but this is the wrong approach in my mind.

We already have irqpoll, which takes care of properly budgeting polling
cycles and not hogging the CPU.

I sent an RFC for this particular problem before [1]. At the time, IIRC,
Christoph suggested that we poll the first batch directly from
the irq context and reap the rest in the irqpoll handler.

[1]: http://lists.infradead.org/pipermail/linux-nvme/2016-October/006497.html


How about something like this instead:
--
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 71127a366d3c..84bf16d75109 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "trace.h"
 #include "nvme.h"
@@ -32,6 +33,7 @@
 #define CQ_SIZE(q) ((q)->q_depth * sizeof(struct nvme_completion))

 #define SGES_PER_PAGE  (PAGE_SIZE / sizeof(struct nvme_sgl_desc))
+#define NVME_POLL_BUDGET_IRQ   256

 /*
  * These can be higher, but we need to ensure that any command doesn't
@@ -189,6 +191,7 @@ struct nvme_queue {
    u32 *dbbuf_cq_db;
    u32 *dbbuf_sq_ei;
    u32 *dbbuf_cq_ei;
+   struct irq_poll iop;
    struct completion delete_done;
 };

@@ -1015,6 +1018,23 @@ static inline int nvme_process_cq(struct nvme_queue *nvmeq, u16 *start,

    return found;
 }

+static int nvme_irqpoll_handler(struct irq_poll *iop, int budget)
+{
+   struct nvme_queue *nvmeq = container_of(iop, struct nvme_queue, iop);
+   struct pci_dev *pdev = to_pci_dev(nvmeq->dev->dev);
+   u16 start, end;
+   int completed;
+
+   completed = nvme_process_cq(nvmeq, &start, &end, budget);
+   nvme_complete_cqes(nvmeq, start, end);
+   if (completed < budget) {
+       irq_poll_complete(&nvmeq->iop);
+       enable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
+   }
+
+   return completed;
+}
+
 static irqreturn_t nvme_irq(int irq, void *data)
 {
struct nvme_queue *nvmeq = data;
@@ -1028,12 +1048,16 @@ static irqreturn_t nvme_irq(int irq, void *data)
    rmb();
    if (nvmeq->cq_head != nvmeq->last_cq_head)
        ret = IRQ_HANDLED;
-   nvme_process_cq(nvmeq, &start, &end, -1);
+   nvme_process_cq(nvmeq, &start, &end, NVME_POLL_BUDGET_IRQ);
    nvmeq->last_cq_head = nvmeq->cq_head;
    wmb();

    if (start != end) {
        nvme_complete_cqes(nvmeq, start, end);
+       if (nvme_cqe_pending(nvmeq)) {
+           disable_irq_nosync(irq);
+           irq_poll_sched(&nvmeq->iop);
+       }
        return IRQ_HANDLED;
    }

@@ -1347,6 +1371,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)

 static void nvme_free_queue(struct nvme_queue *nvmeq)
 {
+   irq_poll_disable(&nvmeq->iop);
    dma_free_coherent(nvmeq->dev->dev, CQ_SIZE(nvmeq),
                (void *)nvmeq->cqes, nvmeq->cq_dma_addr);
    if (!nvmeq->sq_cmds)
@@ -1481,6 +1506,7 @@ static int nvme_alloc_queue(struct nvme_dev *dev, int qid, int depth)

    nvmeq->dev = dev;
    spin_lock_init(&nvmeq->sq_lock);
    spin_lock_init(&nvmeq->cq_poll_lock);
+   irq_poll_init(&nvmeq->iop, NVME_POLL_BUDGET_IRQ, nvme_irqpoll_handler);

    nvmeq->cq_head = 0;
    nvmeq->cq_phase = 1;
    nvmeq->q_db = &dev->dbs[qid * 2 * dev->db_stride];
--
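
For readers following the diff above, its control flow can be modeled in
plain userspace C. This is only a sketch under stated assumptions:
struct model_queue, model_process_cq(), model_irq() and model_poll() are
hypothetical stand-ins, not kernel APIs, and the two flags merely imitate
what disable_irq_nosync()/enable_irq() and irq_poll_sched()/
irq_poll_complete() would do in the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_BUDGET_IRQ 256    /* mirrors NVME_POLL_BUDGET_IRQ in the diff */

/* Hypothetical stand-in for a completion queue; not a kernel structure. */
struct model_queue {
    int pending;            /* completions waiting to be reaped */
    bool irq_masked;        /* models disable_irq_nosync()/enable_irq() */
    bool poll_scheduled;    /* models irq_poll_sched()/irq_poll_complete() */
};

/* Reap at most `budget` completions; return how many were reaped. */
static int model_process_cq(struct model_queue *q, int budget)
{
    assert(budget > 0);
    int done = q->pending < budget ? q->pending : budget;

    q->pending -= done;
    return done;
}

/* Hard-IRQ path: reap one batch, defer whatever is left to the poller. */
static int model_irq(struct model_queue *q)
{
    int done = model_process_cq(q, MODEL_BUDGET_IRQ);

    if (done > 0 && q->pending > 0) {
        q->irq_masked = true;
        q->poll_scheduled = true;
    }
    return done;
}

/* Poll path: reap a batch; unmask the interrupt once the queue drains. */
static int model_poll(struct model_queue *q, int budget)
{
    int done = model_process_cq(q, budget);

    if (done < budget) {
        q->poll_scheduled = false;
        q->irq_masked = false;
    }
    return done;
}
```

With 600 completions pending, model_irq() reaps 256 and masks the
interrupt, and two model_poll() calls reap the remaining 256 and 88 before
the interrupt is unmasked again. The point of the split is that the
hard-IRQ path does a bounded amount of work per invocation.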


Re: [PATCH] infiniband: hfi1: fix memory leaks

2019-08-20 Thread Doug Ledford
On Sun, 2019-08-18 at 13:54 -0500, Wenwen Wang wrote:
> In fault_opcodes_write(), 'data' is allocated through kcalloc().
> However,
> it is not deallocated in the following execution if an error occurs,
> leading to memory leaks. To fix this issue, introduce the 'free_data'
> label
> to free 'data' before returning the error.
> 
> Signed-off-by: Wenwen Wang 

Applied to for-rc, thanks.

-- 
Doug Ledford 
GPG KeyID: B826A3330E572FDD
Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
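
The fix pattern described in the quoted commit message, a single
'free_data' label that every error path jumps to, looks roughly like the
sketch below. It is userspace C with hypothetical names (parse_opcodes(),
model_write()), not the hfi1 code; calloc()/free() stand in for
kcalloc()/kfree().

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical validation step: reject an empty or overlong input. */
static int parse_opcodes(const char *src, char *dst, size_t dst_len)
{
    size_t n = strlen(src);

    if (n == 0 || n >= dst_len)
        return -EINVAL;
    memcpy(dst, src, n + 1);
    return 0;
}

/*
 * Model of a write handler: allocate a scratch buffer, run a step that
 * may fail, and release the buffer on every exit path via one label.
 * Returns the number of bytes consumed or a negative errno value.
 */
static long model_write(const char *input)
{
    long ret;
    char *data = calloc(64, sizeof(*data));

    if (!data)
        return -ENOMEM;

    ret = parse_opcodes(input, data, 64);
    if (ret)
        goto free_data;     /* the error path no longer leaks 'data' */

    ret = (long)strlen(data);

free_data:
    free(data);
    return ret;
}
```

Both the success path and the error path fall through the same label, so
a later error check added to the function only needs a 'goto free_data'.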




Re: [PATCH 3/3] mm/mmap.c: extract __vma_unlink_list as counter part for __vma_link_list

2019-08-20 Thread Matthew Wilcox
On Wed, Aug 14, 2019 at 11:19:37AM +0200, Vlastimil Babka wrote:
> On 8/14/19 8:57 AM, Wei Yang wrote:
> > On Tue, Aug 13, 2019 at 10:16:11PM -0700, Christoph Hellwig wrote:
> >>Btw, is there any good reason we don't use a list_head for vma linkage?
> > 
> > Not sure, maybe there is some historical reason?
> 
> Seems it was single-linked until 2010 commit 297c5eee3724 ("mm: make the vma
> list be doubly linked") and I guess it was just simpler to add the vm_prev 
> link.
> 
> Conversion to list_head might be an interesting project for some "advanced
> beginner" in the kernel :)

I'm working to get rid of vm_prev and vm_next, so it would probably be
wasted effort.
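
For context on the subject line, linking and unlinking on a hand-rolled
doubly linked list of the vm_next/vm_prev kind can be sketched as follows.
This is a simplified userspace model with hypothetical types
(struct model_vma, struct model_mm) and helper names, not the code from
mm/mmap.c.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for vm_area_struct and mm_struct. */
struct model_vma {
    struct model_vma *vm_next, *vm_prev;
};

struct model_mm {
    struct model_vma *mmap;     /* head of the list */
};

/* Insert `vma` after `prev`, or at the head when `prev` is NULL. */
static void model_vma_link_list(struct model_mm *mm, struct model_vma *vma,
                                struct model_vma *prev)
{
    struct model_vma *next;

    vma->vm_prev = prev;
    if (prev) {
        next = prev->vm_next;
        prev->vm_next = vma;
    } else {
        next = mm->mmap;
        mm->mmap = vma;
    }
    vma->vm_next = next;
    if (next)
        next->vm_prev = vma;
}

/* Counterpart: remove `vma`, fixing up both neighbours and the head. */
static void model_vma_unlink_list(struct model_mm *mm, struct model_vma *vma)
{
    struct model_vma *prev = vma->vm_prev, *next = vma->vm_next;

    if (prev)
        prev->vm_next = next;
    else
        mm->mmap = next;
    if (next)
        next->vm_prev = prev;
}
```

With a list_head, both helpers would reduce to the generic list
primitives, which is presumably the appeal of the conversion raised in the
quoted text.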


Re: [PATCH] infiniband: hfi1: fix a memory leak bug

2019-08-20 Thread Doug Ledford
On Sun, 2019-08-18 at 14:29 -0500, Wenwen Wang wrote:
> In fault_opcodes_read(), 'data' is not deallocated if
> debugfs_file_get()
> fails, leading to a memory leak. To fix this bug, introduce the
> 'free_data'
> label to free 'data' before returning the error.
> 
> Signed-off-by: Wenwen Wang 

Applied to for-rc, thanks.

-- 
Doug Ledford 
GPG KeyID: B826A3330E572FDD
Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD




[PATCH v3 2/8] arm64: dts: qcom: pm8150: Add base dts file

2019-08-20 Thread Vinod Koul
Add base DTS file for pm8150 along with GPIOs, power-on, rtc and vadc
nodes

Signed-off-by: Vinod Koul 
---
 arch/arm64/boot/dts/qcom/pm8150.dtsi | 97 
 1 file changed, 97 insertions(+)
 create mode 100644 arch/arm64/boot/dts/qcom/pm8150.dtsi

diff --git a/arch/arm64/boot/dts/qcom/pm8150.dtsi 
b/arch/arm64/boot/dts/qcom/pm8150.dtsi
new file mode 100644
index ..b6e304748a57
--- /dev/null
+++ b/arch/arm64/boot/dts/qcom/pm8150.dtsi
@@ -0,0 +1,97 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2017-2019, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2019, Linaro Limited
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+&spmi_bus {
+   pm8150_0: pmic@0 {
+   compatible = "qcom,pm8150", "qcom,spmi-pmic";
+   reg = <0x0 SPMI_USID>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   pon: power-on@800 {
+   compatible = "qcom,pm8916-pon";
+   reg = <0x0800>;
+   pwrkey {
+   compatible = "qcom,pm8941-pwrkey";
+   interrupts = <0x0 0x8 0x0 IRQ_TYPE_EDGE_BOTH>;
+   debounce = <15625>;
+   bias-pull-up;
+   linux,code = ;
+
+   status = "disabled";
+   };
+   };
+
+   pm8150_adc: adc@3100 {
+   compatible = "qcom,spmi-adc5";
+   reg = <0x3100>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   #io-channel-cells = <1>;
+   interrupts = <0x0 0x31 0x0 IRQ_TYPE_EDGE_RISING>;
+
+   status = "disabled";
+
+   ref-gnd@0 {
+   reg = ;
+   qcom,pre-scaling = <1 1>;
+   label = "ref_gnd";
+   };
+
+   vref-1p25@1 {
+   reg = ;
+   qcom,pre-scaling = <1 1>;
+   label = "vref_1p25";
+   };
+
+   die-temp@6 {
+   reg = ;
+   qcom,pre-scaling = <1 1>;
+   label = "die_temp";
+   };
+   };
+
+   rtc@6000 {
+   compatible = "qcom,pm8941-rtc";
+   reg = <0x6000>;
+   reg-names = "rtc", "alarm";
+   interrupts = <0x0 0x61 0x1 IRQ_TYPE_NONE>;
+
+   status = "disabled";
+   };
+
+   pm8150_gpios: gpio@c000 {
+   compatible = "qcom,pm8150-gpio";
+   reg = <0xc000>;
+   gpio-controller;
+   #gpio-cells = <2>;
+   interrupts = <0x0 0xc0 0x0 IRQ_TYPE_NONE>,
+<0x0 0xc1 0x0 IRQ_TYPE_NONE>,
+<0x0 0xc2 0x0 IRQ_TYPE_NONE>,
+<0x0 0xc3 0x0 IRQ_TYPE_NONE>,
+<0x0 0xc4 0x0 IRQ_TYPE_NONE>,
+<0x0 0xc5 0x0 IRQ_TYPE_NONE>,
+<0x0 0xc6 0x0 IRQ_TYPE_NONE>,
+<0x0 0xc7 0x0 IRQ_TYPE_NONE>,
+<0x0 0xc8 0x0 IRQ_TYPE_NONE>,
+<0x0 0xc9 0x0 IRQ_TYPE_NONE>,
+<0x0 0xca 0x0 IRQ_TYPE_NONE>,
+<0x0 0xcb 0x0 IRQ_TYPE_NONE>;
+   };
+   };
+
+   pmic@1 {
+   compatible = "qcom,pm8150", "qcom,spmi-pmic";
+   reg = <0x1 SPMI_USID>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   };
+};
-- 
2.20.1



[PATCH v3 5/8] arm64: dts: qcom: sm8150-mtp: Add base dts file

2019-08-20 Thread Vinod Koul
This adds the base DTS file for sm8150-mtp and enables boot to console, adds
the tlmm reserved range, resin node and volume down key, and also includes
the pmic files.

Signed-off-by: Vinod Koul 
---
 arch/arm64/boot/dts/qcom/Makefile   |  1 +
 arch/arm64/boot/dts/qcom/sm8150-mtp.dts | 51 +
 2 files changed, 52 insertions(+)
 create mode 100644 arch/arm64/boot/dts/qcom/sm8150-mtp.dts

diff --git a/arch/arm64/boot/dts/qcom/Makefile 
b/arch/arm64/boot/dts/qcom/Makefile
index 0a7e5dfce6f7..1964dacaf19b 100644
--- a/arch/arm64/boot/dts/qcom/Makefile
+++ b/arch/arm64/boot/dts/qcom/Makefile
@@ -12,5 +12,6 @@ dtb-$(CONFIG_ARCH_QCOM)   += sdm845-cheza-r2.dtb
 dtb-$(CONFIG_ARCH_QCOM)+= sdm845-cheza-r3.dtb
 dtb-$(CONFIG_ARCH_QCOM)+= sdm845-db845c.dtb
 dtb-$(CONFIG_ARCH_QCOM)+= sdm845-mtp.dtb
+dtb-$(CONFIG_ARCH_QCOM)+= sm8150-mtp.dtb
 dtb-$(CONFIG_ARCH_QCOM)+= qcs404-evb-1000.dtb
 dtb-$(CONFIG_ARCH_QCOM)+= qcs404-evb-4000.dtb
diff --git a/arch/arm64/boot/dts/qcom/sm8150-mtp.dts 
b/arch/arm64/boot/dts/qcom/sm8150-mtp.dts
new file mode 100644
index ..6f5777f530ae
--- /dev/null
+++ b/arch/arm64/boot/dts/qcom/sm8150-mtp.dts
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2017-2019, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2019, Linaro Limited
+ */
+
+/dts-v1/;
+
+#include "sm8150.dtsi"
+#include "pm8150.dtsi"
+#include "pm8150b.dtsi"
+#include "pm8150l.dtsi"
+
+/ {
+   model = "Qualcomm Technologies, Inc. SM8150 MTP";
+   compatible = "qcom,sm8150-mtp";
+
+   aliases {
+   serial0 = 
+   };
+
+   chosen {
+   stdout-path = "serial0:115200n8";
+   };
+};
+
+_id_1 {
+   status = "okay";
+};
+
+ {
+   pwrkey {
+   status = "okay";
+   };
+
+   resin {
+   compatible = "qcom,pm8941-resin";
+   interrupts = <0x0 0x8 0x1 IRQ_TYPE_EDGE_BOTH>;
+   debounce = <15625>;
+   bias-pull-up;
+   linux,code = ;
+   };
+};
+
+ {
+   gpio-reserved-ranges = <0 4>, <126 4>;
+};
+
+ {
+   status = "okay";
+};
-- 
2.20.1



[PATCH v3 7/8] arm64: dts: qcom: sm8150: Add reserved-memory regions

2019-08-20 Thread Vinod Koul
Add the reserved memory regions in SM8150

Signed-off-by: Vinod Koul 
---
 arch/arm64/boot/dts/qcom/sm8150.dtsi | 111 +++
 1 file changed, 111 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi 
b/arch/arm64/boot/dts/qcom/sm8150.dtsi
index ba5a9f6332c1..3bed04d60dea 100644
--- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
@@ -155,6 +155,117 @@
method = "smc";
};
 
+   reserved-memory {
+   #address-cells = <2>;
+   #size-cells = <2>;
+   ranges;
+
+   hyp_mem: memory@8570 {
+   reg = <0x0 0x8570 0x0 0x60>;
+   no-map;
+   };
+
+   xbl_mem: memory@85d0 {
+   reg = <0x0 0x85d0 0x0 0x14>;
+   no-map;
+   };
+
+   aop_mem: memory@85f0 {
+   reg = <0x0 0x85f0 0x0 0x2>;
+   no-map;
+   };
+
+   aop_cmd_db: memory@85f2 {
+   compatible = "qcom,cmd-db";
+   reg = <0x0 0x85f2 0x0 0x2>;
+   no-map;
+   };
+
+   smem_mem: memory@8600 {
+   reg = <0x0 0x8600 0x0 0x20>;
+   no-map;
+   };
+
+   tz_mem: memory@8620 {
+   reg = <0x0 0x8620 0x0 0x390>;
+   no-map;
+   };
+
+   rmtfs_mem: memory@89b0 {
+   compatible = "qcom,rmtfs-mem";
+   reg = <0x0 0x89b0 0x0 0x20>;
+   no-map;
+
+   qcom,client-id = <1>;
+   qcom,vmid = <15>;
+   };
+
+   camera_mem: memory@8b70 {
+   reg = <0x0 0x8b70 0x0 0x50>;
+   no-map;
+   };
+
+   wlan_mem: memory@8bc0 {
+   reg = <0x0 0x8bc0 0x0 0x18>;
+   no-map;
+   };
+
+   npu_mem: memory@8bd8 {
+   reg = <0x0 0x8bd8 0x0 0x8>;
+   no-map;
+   };
+
+   adsp_mem: memory@8be0 {
+   reg = <0x0 0x8be0 0x0 0x1a0>;
+   no-map;
+   };
+
+   mpss_mem: memory@8d80 {
+   reg = <0x0 0x8d80 0x0 0x960>;
+   no-map;
+   };
+
+   venus_mem: memory@96e0 {
+   reg = <0x0 0x96e0 0x0 0x50>;
+   no-map;
+   };
+
+   slpi_mem: memory@9730 {
+   reg = <0x0 0x9730 0x0 0x140>;
+   no-map;
+   };
+
+   ipa_fw_mem: memory@9870 {
+   reg = <0x0 0x9870 0x0 0x1>;
+   no-map;
+   };
+
+   ipa_gsi_mem: memory@9871 {
+   reg = <0x0 0x9871 0x0 0x5000>;
+   no-map;
+   };
+
+   gpu_mem: memory@98715000 {
+   reg = <0x0 0x98715000 0x0 0x2000>;
+   no-map;
+   };
+
+   spss_mem: memory@9880 {
+   reg = <0x0 0x9880 0x0 0x10>;
+   no-map;
+   };
+
+   cdsp_mem: memory@9890 {
+   reg = <0x0 0x9890 0x0 0x140>;
+   no-map;
+   };
+
+   qseecom_mem: memory@9e40 {
+   reg = <0x0 0x9e40 0x0 0x140>;
+   no-map;
+   };
+   };
+
soc: soc@0 {
#address-cells = <1>;
#size-cells = <1>;
-- 
2.20.1



[PATCH v3 4/8] arm64: dts: qcom: pm8150l: Add base dts file

2019-08-20 Thread Vinod Koul
PMIC pm8150l is a slave pmic and this adds base DTS file for pm8150l
with power-on, adc and gpio nodes

Signed-off-by: Vinod Koul 
---
 arch/arm64/boot/dts/qcom/pm8150l.dtsi | 80 +++
 1 file changed, 80 insertions(+)
 create mode 100644 arch/arm64/boot/dts/qcom/pm8150l.dtsi

diff --git a/arch/arm64/boot/dts/qcom/pm8150l.dtsi 
b/arch/arm64/boot/dts/qcom/pm8150l.dtsi
new file mode 100644
index ..eb0e9a090e42
--- /dev/null
+++ b/arch/arm64/boot/dts/qcom/pm8150l.dtsi
@@ -0,0 +1,80 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2017-2019, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2019, Linaro Limited
+ */
+
+#include 
+#include 
+#include 
+
+&spmi_bus {
+   pmic@4 {
+   compatible = "qcom,pm8150l", "qcom,spmi-pmic";
+   reg = <0x4 SPMI_USID>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   power-on@800 {
+   compatible = "qcom,pm8916-pon";
+   reg = <0x0800>;
+
+   status = "disabled";
+   };
+
+   adc@3100 {
+   compatible = "qcom,spmi-adc5";
+   reg = <0x3100>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   #io-channel-cells = <1>;
+   interrupts = <0x4 0x31 0x0 IRQ_TYPE_EDGE_RISING>;
+
+   status = "disabled";
+
+   ref-gnd@0 {
+   reg = ;
+   qcom,pre-scaling = <1 1>;
+   label = "ref_gnd";
+   };
+
+   vref-1p25@1 {
+   reg = ;
+   qcom,pre-scaling = <1 1>;
+   label = "vref_1p25";
+   };
+
+   die-temp@6 {
+   reg = ;
+   qcom,pre-scaling = <1 1>;
+   label = "die_temp";
+   };
+   };
+
+   pm8150l_gpios: gpio@c000 {
+   compatible = "qcom,pm8150l-gpio";
+   reg = <0xc000>;
+   gpio-controller;
+   #gpio-cells = <2>;
+   interrupts = <0x4 0xc0 0x0 IRQ_TYPE_NONE>,
+<0x4 0xc1 0x0 IRQ_TYPE_NONE>,
+<0x4 0xc2 0x0 IRQ_TYPE_NONE>,
+<0x4 0xc3 0x0 IRQ_TYPE_NONE>,
+<0x4 0xc4 0x0 IRQ_TYPE_NONE>,
+<0x4 0xc5 0x0 IRQ_TYPE_NONE>,
+<0x4 0xc6 0x0 IRQ_TYPE_NONE>,
+<0x4 0xc7 0x0 IRQ_TYPE_NONE>,
+<0x4 0xc8 0x0 IRQ_TYPE_NONE>,
+<0x4 0xc9 0x0 IRQ_TYPE_NONE>,
+<0x4 0xca 0x0 IRQ_TYPE_NONE>,
+<0x4 0xcb 0x0 IRQ_TYPE_NONE>;
+   };
+   };
+
+   pmic@5 {
+   compatible = "qcom,pm8150l", "qcom,spmi-pmic";
+   reg = <0x5 SPMI_USID>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   };
+};
-- 
2.20.1



[PATCH v3 6/8] arm64: dts: qcom: sm8150-mtp: Add regulators

2019-08-20 Thread Vinod Koul
Add the regulators found in the mtp platform. This platform consists of
pmic PM8150, PM8150L and PM8009.

Signed-off-by: Vinod Koul 
---
 arch/arm64/boot/dts/qcom/sm8150-mtp.dts | 327 
 1 file changed, 327 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8150-mtp.dts 
b/arch/arm64/boot/dts/qcom/sm8150-mtp.dts
index 6f5777f530ae..340d57cc62bf 100644
--- a/arch/arm64/boot/dts/qcom/sm8150-mtp.dts
+++ b/arch/arm64/boot/dts/qcom/sm8150-mtp.dts
@@ -6,6 +6,7 @@
 
 /dts-v1/;
 
+#include 
 #include "sm8150.dtsi"
 #include "pm8150.dtsi"
 #include "pm8150b.dtsi"
@@ -22,6 +23,332 @@
chosen {
stdout-path = "serial0:115200n8";
};
+
+   vph_pwr: vph-pwr-regulator {
+   compatible = "regulator-fixed";
+   regulator-name = "vph_pwr";
+   regulator-min-microvolt = <370>;
+   regulator-max-microvolt = <370>;
+   };
+
+   /*
+* Apparently RPMh does not provide support for PM8150 S4 because it
+* is always-on; model it as a fixed regulator.
+*/
+   vreg_s4a_1p8: pm8150-s4 {
+   compatible = "regulator-fixed";
+   regulator-name = "vreg_s4a_1p8";
+
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <180>;
+
+   regulator-always-on;
+   regulator-boot-on;
+
+   vin-supply = <&vph_pwr>;
+   };
+};
+
+&apps_rsc {
+   pm8150-rpmh-regulators {
+   compatible = "qcom,pm8150-rpmh-regulators";
+   qcom,pmic-id = "a";
+
+   vdd-s1-supply = <&vph_pwr>;
+   vdd-s2-supply = <&vph_pwr>;
+   vdd-s3-supply = <&vph_pwr>;
+   vdd-s4-supply = <&vph_pwr>;
+   vdd-s5-supply = <&vph_pwr>;
+   vdd-s6-supply = <&vph_pwr>;
+   vdd-s7-supply = <&vph_pwr>;
+   vdd-s8-supply = <&vph_pwr>;
+   vdd-s9-supply = <&vph_pwr>;
+   vdd-s10-supply = <&vph_pwr>;
+
+   vdd-l1-l8-l11-supply = <_s6a_0p9>;
+   vdd-l2-l10-supply = <_bob>;
+   vdd-l3-l4-l5-l18-supply = <_s6a_0p9>;
+   vdd-l6-l9-supply = <_s8c_1p3>;
+   vdd-l7-l12-l14-l15-supply = <_s5a_2p0>;
+   vdd-l13-l16-l17-supply = <_bob>;
+
+   vreg_s5a_2p0: smps5 {
+   regulator-min-microvolt = <1904000>;
+   regulator-max-microvolt = <200>;
+   };
+
+   vreg_s6a_0p9: smps6 {
+   regulator-min-microvolt = <92>;
+   regulator-max-microvolt = <1128000>;
+   };
+
+   vdda_wcss_pll:
+   vreg_l1a_0p75: ldo1 {
+   regulator-min-microvolt = <752000>;
+   regulator-max-microvolt = <752000>;
+   regulator-initial-mode = ;
+   };
+
+   vdd_pdphy:
+   vdda_usb_hs_3p1:
+   vreg_l2a_3p1: ldo2 {
+   regulator-min-microvolt = <3072000>;
+   regulator-max-microvolt = <3072000>;
+   regulator-initial-mode = ;
+   };
+
+   vreg_l3a_0p8: ldo3 {
+   regulator-min-microvolt = <48>;
+   regulator-max-microvolt = <932000>;
+   regulator-initial-mode = ;
+   };
+
+   vdd_usb_hs_core:
+   vdda_csi_0_0p9:
+   vdda_csi_1_0p9:
+   vdda_csi_2_0p9:
+   vdda_csi_3_0p9:
+   vdda_dsi_0_0p9:
+   vdda_dsi_1_0p9:
+   vdda_dsi_0_pll_0p9:
+   vdda_dsi_1_pll_0p9:
+   vdda_pcie_1ln_core:
+   vdda_pcie_2ln_core:
+   vdda_pll_hv_cc_ebi01:
+   vdda_pll_hv_cc_ebi23:
+   vdda_qrefs_0p875_5:
+   vdda_sp_sensor:
+   vdda_ufs_2ln_core_1:
+   vdda_ufs_2ln_core_2:
+   vdda_usb_ss_dp_core_1:
+   vdda_usb_ss_dp_core_2:
+   vdda_qlink_lv:
+   vdda_qlink_lv_ck:
+   vreg_l5a_0p875: ldo5 {
+   regulator-min-microvolt = <88>;
+   regulator-max-microvolt = <88>;
+   regulator-initial-mode = ;
+   };
+
+   vreg_l6a_1p2: ldo6 {
+   regulator-min-microvolt = <120>;
+   regulator-max-microvolt = <120>;
+   regulator-initial-mode = ;
+   };
+
+   vreg_l7a_1p8: ldo7 {
+   regulator-min-microvolt = <180>;
+   regulator-max-microvolt = <180>;
+   regulator-initial-mode = ;
+   };
+
+   vddpx_10:
+   vreg_l9a_1p2: ldo9 {
+   regulator-min-microvolt = <120>;
+   

[PATCH v3 8/8] arm64: dts: qcom: sm8150: Add apps shared nodes

2019-08-20 Thread Vinod Koul
Add hwlock, pmu, smem, tcsr_mutex_regs, apss_shared mailbox, apps_rsc
including the rpmhcc child nodes to the SM8150 DTSI

Co-developed-by: Sibi Sankar 
Signed-off-by: Sibi Sankar 
Signed-off-by: Vinod Koul 
---
 arch/arm64/boot/dts/qcom/sm8150.dtsi | 63 
 1 file changed, 63 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi 
b/arch/arm64/boot/dts/qcom/sm8150.dtsi
index 3bed04d60dea..781905e9977a 100644
--- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
@@ -144,12 +144,23 @@
};
};
 
+   tcsr_mutex: hwlock {
+   compatible = "qcom,tcsr-mutex";
+   syscon = <&tcsr_mutex_regs 0 0x1000>;
+   #hwlock-cells = <1>;
+   };
+
memory@8000 {
device_type = "memory";
/* We expect the bootloader to fill in the size */
reg = <0x0 0x8000 0x0 0x0>;
};
 
+   pmu {
+   compatible = "arm,armv8-pmuv3";
+   interrupts = ;
+   };
+
psci {
compatible = "arm,psci-1.0";
method = "smc";
@@ -266,6 +277,12 @@
};
};
 
+   smem {
+   compatible = "qcom,smem";
+   memory-region = <&smem_mem>;
+   hwlocks = <&tcsr_mutex 3>;
+   };
+
soc: soc@0 {
#address-cells = <1>;
#size-cells = <1>;
@@ -305,6 +322,11 @@
};
};
 
+   tcsr_mutex_regs: syscon@1f4 {
+   compatible = "syscon";
+   reg = <0x01f4 0x4>;
+   };
+
tlmm: pinctrl@310 {
compatible = "qcom,sm8150-pinctrl";
reg = <0x0310 0x30>,
@@ -320,6 +342,16 @@
#interrupt-cells = <2>;
};
 
+   aoss_qmp: power-controller@c30 {
+   compatible = "qcom,sm8150-aoss-qmp";
+   reg = <0x0c30 0x10>;
+   interrupts = ;
+   mboxes = <&apss_shared 0>;
+
+   #clock-cells = <0>;
+   #power-domain-cells = <1>;
+   };
+
intc: interrupt-controller@17a0 {
compatible = "arm,gic-v3";
interrupt-controller;
@@ -329,6 +361,12 @@
interrupts = ;
};
 
+   apss_shared: mailbox@17c0 {
+   compatible = "qcom,sm8150-apss-shared";
+   reg = <0x17c0 0x1000>;
+   #mbox-cells = <1>;
+   };
+
timer@17c2 {
#address-cells = <1>;
#size-cells = <1>;
@@ -388,6 +426,31 @@
};
};
 
+   apps_rsc: rsc@1820 {
+   label = "apps_rsc";
+   compatible = "qcom,rpmh-rsc";
+   reg = <0x1820 0x1>,
+ <0x1821 0x1>,
+ <0x1822 0x1>;
+   reg-names = "drv-0", "drv-1", "drv-2";
+   interrupts = ,
+,
+;
+   qcom,tcs-offset = <0xd00>;
+   qcom,drv-id = <2>;
+   qcom,tcs-config = ,
+ ,
+ ,
+ ;
+
+   rpmhcc: clock-controller {
+   compatible = "qcom,sm8150-rpmh-clk";
+   #clock-cells = <1>;
+   clock-names = "xo";
+   clocks = <&xo_board>;
+   };
+   };
+
spmi_bus: spmi@c44 {
compatible = "qcom,spmi-pmic-arb";
reg = <0x0c44 0x0001100>,
-- 
2.20.1



[PATCH v3 3/8] arm64: dts: qcom: pm8150b: Add base dts file

2019-08-20 Thread Vinod Koul
PMIC pm8150b is a slave pmic and this adds base DTS file for pm8150b
with power-on, adc, and gpio nodes

Signed-off-by: Vinod Koul 
---
 arch/arm64/boot/dts/qcom/pm8150b.dtsi | 86 +++
 1 file changed, 86 insertions(+)
 create mode 100644 arch/arm64/boot/dts/qcom/pm8150b.dtsi

diff --git a/arch/arm64/boot/dts/qcom/pm8150b.dtsi 
b/arch/arm64/boot/dts/qcom/pm8150b.dtsi
new file mode 100644
index ..322379d5c31f
--- /dev/null
+++ b/arch/arm64/boot/dts/qcom/pm8150b.dtsi
@@ -0,0 +1,86 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2017-2019, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2019, Linaro Limited
+ */
+
+#include 
+#include 
+#include 
+
+&spmi_bus {
+   pmic@2 {
+   compatible = "qcom,pm8150b", "qcom,spmi-pmic";
+   reg = <0x2 SPMI_USID>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   power-on@800 {
+   compatible = "qcom,pm8916-pon";
+   reg = <0x0800>;
+
+   status = "disabled";
+   };
+
+   adc@3100 {
+   compatible = "qcom,spmi-adc5";
+   reg = <0x3100>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   #io-channel-cells = <1>;
+   interrupts = <0x2 0x31 0x0 IRQ_TYPE_EDGE_RISING>;
+
+   status = "disabled";
+
+   ref-gnd@0 {
+   reg = ;
+   qcom,pre-scaling = <1 1>;
+   label = "ref_gnd";
+   };
+
+   vref-1p25@1 {
+   reg = ;
+   qcom,pre-scaling = <1 1>;
+   label = "vref_1p25";
+   };
+
+   die-temp@6 {
+   reg = ;
+   qcom,pre-scaling = <1 1>;
+   label = "die_temp";
+   };
+
+   chg-temp@9 {
+   reg = ;
+   qcom,pre-scaling = <1 1>;
+   label = "chg_temp";
+   };
+   };
+
+   pm8150b_gpios: gpio@c000 {
+   compatible = "qcom,pm8150b-gpio";
+   reg = <0xc000>;
+   gpio-controller;
+   #gpio-cells = <2>;
+   interrupts = <0x2 0xc0 0x0 IRQ_TYPE_NONE>,
+<0x2 0xc1 0x0 IRQ_TYPE_NONE>,
+<0x2 0xc2 0x0 IRQ_TYPE_NONE>,
+<0x2 0xc3 0x0 IRQ_TYPE_NONE>,
+<0x2 0xc4 0x0 IRQ_TYPE_NONE>,
+<0x2 0xc5 0x0 IRQ_TYPE_NONE>,
+<0x2 0xc6 0x0 IRQ_TYPE_NONE>,
+<0x2 0xc7 0x0 IRQ_TYPE_NONE>,
+<0x2 0xc8 0x0 IRQ_TYPE_NONE>,
+<0x2 0xc9 0x0 IRQ_TYPE_NONE>,
+<0x2 0xca 0x0 IRQ_TYPE_NONE>,
+<0x2 0xcb 0x0 IRQ_TYPE_NONE>;
+   };
+   };
+
+   pmic@3 {
+   compatible = "qcom,pm8150b", "qcom,spmi-pmic";
+   reg = <0x3 SPMI_USID>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   };
+};
-- 
2.20.1



[PATCH v3 1/8] arm64: dts: qcom: sm8150: Add base dts file

2019-08-20 Thread Vinod Koul
This adds the base DTS file with cpu, psci, firmware, clock, tlmm and
spmi nodes, and enables boot to console

Signed-off-by: Vinod Koul 
---
 arch/arm64/boot/dts/qcom/sm8150.dtsi | 307 +++
 1 file changed, 307 insertions(+)
 create mode 100644 arch/arm64/boot/dts/qcom/sm8150.dtsi

diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi 
b/arch/arm64/boot/dts/qcom/sm8150.dtsi
new file mode 100644
index ..ba5a9f6332c1
--- /dev/null
+++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
@@ -0,0 +1,307 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2017-2019, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2019, Linaro Limited
+ */
+
+#include 
+#include 
+#include 
+
+/ {
+   interrupt-parent = <>;
+
+   #address-cells = <2>;
+   #size-cells = <2>;
+
+   chosen { };
+
+   clocks {
+   xo_board: xo-board {
+   compatible = "fixed-clock";
+   #clock-cells = <0>;
+   clock-frequency = <3840>;
+   clock-output-names = "xo_board";
+   };
+
+   sleep_clk: sleep-clk {
+   compatible = "fixed-clock";
+   #clock-cells = <0>;
+   clock-frequency = <32764>;
+   clock-output-names = "sleep_clk";
+   };
+   };
+
+   cpus {
+   #address-cells = <2>;
+   #size-cells = <0>;
+
+   CPU0: cpu@0 {
+   device_type = "cpu";
+   compatible = "qcom,kryo485";
+   reg = <0x0 0x0>;
+   enable-method = "psci";
+   next-level-cache = <_0>;
+   L2_0: l2-cache {
+   compatible = "cache";
+   next-level-cache = <_0>;
+   L3_0: l3-cache {
+ compatible = "cache";
+   };
+   };
+   };
+
+   CPU1: cpu@100 {
+   device_type = "cpu";
+   compatible = "qcom,kryo485";
+   reg = <0x0 0x100>;
+   enable-method = "psci";
+   next-level-cache = <_100>;
+   L2_100: l2-cache {
+   compatible = "cache";
+   next-level-cache = <_0>;
+   };
+
+   };
+
+   CPU2: cpu@200 {
+   device_type = "cpu";
+   compatible = "qcom,kryo485";
+   reg = <0x0 0x200>;
+   enable-method = "psci";
+   next-level-cache = <_200>;
+   L2_200: l2-cache {
+   compatible = "cache";
+   next-level-cache = <_0>;
+   };
+   };
+
+   CPU3: cpu@300 {
+   device_type = "cpu";
+   compatible = "qcom,kryo485";
+   reg = <0x0 0x300>;
+   enable-method = "psci";
+   next-level-cache = <_300>;
+   L2_300: l2-cache {
+   compatible = "cache";
+   next-level-cache = <_0>;
+   };
+   };
+
+   CPU4: cpu@400 {
+   device_type = "cpu";
+   compatible = "qcom,kryo485";
+   reg = <0x0 0x400>;
+   enable-method = "psci";
+   next-level-cache = <_400>;
+   L2_400: l2-cache {
+   compatible = "cache";
+   next-level-cache = <_0>;
+   };
+   };
+
+   CPU5: cpu@500 {
+   device_type = "cpu";
+   compatible = "qcom,kryo485";
+   reg = <0x0 0x500>;
+   enable-method = "psci";
+   next-level-cache = <_500>;
+   L2_500: l2-cache {
+   compatible = "cache";
+   next-level-cache = <_0>;
+   };
+   };
+
+   CPU6: cpu@600 {
+   device_type = "cpu";
+   compatible = "qcom,kryo485";
+   reg = <0x0 0x600>;
+   enable-method = "psci";
+   next-level-cache = <_600>;
+   L2_600: l2-cache {
+   compatible = "cache";
+   next-level-cache = <_0>;
+   };
+   };
+
+   CPU7: cpu@700 {
+   

[PATCH v3 0/8] arm64: dts: qcom: sm8150: Add SM8150 DTS

2019-08-20 Thread Vinod Koul
This series adds DTS for SM8150, PMIC PM8150, PM8150B, PM8150L and
the MTP for SM8150.

Changes in v3:
 - Fix copyright comment style to Linux kernel style
 - Make property values all hex or decimal
 - Fix patch titles and logs and make them consistent
 - Fix line breaks

Changes in v2:
 - Squash patches
 - Fix comments given by Stephen, namely: lowercase for hex numbers,
   making rpmhcc have xo_board as parent, rename pon controller to
   power-on controller, make pmic nodes disabled, etc.
 - removed the dependency on clk defines and use raw numbers


Vinod Koul (8):
  arm64: dts: qcom: sm8150: Add base dts file
  arm64: dts: qcom: pm8150: Add base dts file
  arm64: dts: qcom: pm8150b: Add base dts file
  arm64: dts: qcom: pm8150l: Add base dts file
  arm64: dts: qcom: sm8150-mtp: Add base dts file
  arm64: dts: qcom: sm8150-mtp: Add regulators
  arm64: dts: qcom: sm8150: Add reserved-memory regions
  arm64: dts: qcom: sm8150: Add apps shared nodes

 arch/arm64/boot/dts/qcom/Makefile   |   1 +
 arch/arm64/boot/dts/qcom/pm8150.dtsi|  97 +
 arch/arm64/boot/dts/qcom/pm8150b.dtsi   |  86 +
 arch/arm64/boot/dts/qcom/pm8150l.dtsi   |  80 
 arch/arm64/boot/dts/qcom/sm8150-mtp.dts | 378 +++
 arch/arm64/boot/dts/qcom/sm8150.dtsi| 481 
 6 files changed, 1123 insertions(+)
 create mode 100644 arch/arm64/boot/dts/qcom/pm8150.dtsi
 create mode 100644 arch/arm64/boot/dts/qcom/pm8150b.dtsi
 create mode 100644 arch/arm64/boot/dts/qcom/pm8150l.dtsi
 create mode 100644 arch/arm64/boot/dts/qcom/sm8150-mtp.dts
 create mode 100644 arch/arm64/boot/dts/qcom/sm8150.dtsi

-- 
2.20.1



Re: [PATCH v2 0/2] Simplify mtty driver and mdev core

2019-08-20 Thread Alex Williamson
On Tue, 20 Aug 2019 08:58:02 +
Parav Pandit  wrote:

> + Dave.
> 
> Hi Jiri, Dave, Alex, Kirti, Cornelia,
> 
> Please provide your feedback on it, how shall we proceed?
> 
> Short summary of requirements.
> For a given mdev (mediated device [1]), there is one representor
> netdevice and devlink port in switchdev mode (similar to SR-IOV VF),
> And there is one netdevice for the actual mdev when mdev is probed.
> 
> (a) representor netdev and devlink port should be able to derive
> phys_port_name(), so that the representor netdev name can be built
> deterministically across reboots.
> 
> (b) for mdev's netdevice, mdev's device should have an attribute.
> This attribute can be used by udev rules/systemd or something else to
> rename netdev name deterministically.
> 
> (c) IFNAMSIZ of 16 bytes is too small to fit whole UUID.
> A simple grep for IFNAMSIZ in the stack hints at hundreds of users of
> IFNAMSIZ in drivers, uapi, netlink, boot config area and more. Changing
> IFNAMSIZ for a mdev bus doesn't really look like a reasonable option to me.

How many characters do we really have to work with?  Your examples
below prepend various characters, ex. option-1 results in ens2f0_m10 or
enm10.  Do the extra 8 or 3 characters in these count against IFNAMSIZ?
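
To make the length budget concrete, here is a minimal userspace sketch in C.
It assumes IFNAMSIZ is 16 bytes including the trailing NUL, so the whole
interface name, prefix included, has to fit in 15 characters. The names
IFNAMSIZ_SKETCH, ifname_fits() and rep_name_fits() are made up for this
illustration and are not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumption: mirrors IFNAMSIZ (16), which counts the trailing NUL. */
#define IFNAMSIZ_SKETCH 16

/* Returns true if name plus its terminating NUL fits in IFNAMSIZ bytes. */
static bool ifname_fits(const char *name)
{
	assert(name != NULL);
	return strlen(name) + 1 <= IFNAMSIZ_SKETCH;
}

/* Builds "<parent>_m<index>" as in option-1 and checks whether it fits. */
static bool rep_name_fits(const char *parent, unsigned int index)
{
	char buf[64];
	int n;

	assert(parent != NULL);
	n = snprintf(buf, sizeof(buf), "%s_m%u", parent, index);
	return n > 0 && (size_t)n < sizeof(buf) && ifname_fits(buf);
}
```

With these helpers, "ens2f0_m10" (10 characters) and "enm10" fit, while a
36-character UUID string or a long parent name combined with a large index
does not.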

> Hence, I would like to discuss below options.
> 
> Option-1: mdev index
> Introduce an optional mdev index/handle as u32 during mdev create
> time. User passes mdev index/handle as input.
> 
> phys_port_name=mIndex=m%u
> mdev_index will be available in sysfs as mdev attribute for udev to
> name the mdev's netdev.
> 
> example mdev create command:
> UUID=$(uuidgen)
> echo $UUID index=10
> > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create

Nit, IIRC previous discussions of additional parameters used comma
separators, ex. echo $UUID,index=10 >...

> > example netdevs:
> repnetdev=ens2f0_m10  /*ens2f0 is parent PF's netdevice */

Is the parent really relevant in the name?  Tools like mdevctl are
meant to provide persistence, creating the same mdev devices on the
same parent, but that's simply the easiest policy decision.  We can
also imagine that multiple parent devices might support a specified
mdev type and policies factoring in proximity, load-balancing, power
consumption, etc might be weighed such that we really don't want to
promote userspace creating dependencies on the parent association.

> mdev_netdev=enm10
> 
> Pros:
> 1. mdevctl and any other existing tools are unaffected.
> 2. netdev stack, ovs and other switching platforms are unaffected.
> 3. achieves unique phys_port_name for representor netdev
> 4. achieves unique mdev eth netdev name for the mdev using
> udev/systemd extension.
> 5. Aligns well with mdev and netdev subsystem
> and similar to existing sriov bdf's.

A user provided index seems strange to me.  It's not really an index,
just a user specified instance number.  Presumably you have the user
providing this because if it really were an index, then the value
depends on the creation order and persistence is lost.  Now the user
needs to both avoid uuid collision as well as "index" number
collision.  The uuid namespace is large enough to mostly ignore this,
but this is not.  This seems like a burden.

> Option-2: shorter mdev name
> Extend mdev to have shorter mdev device name in addition to UUID.
> such as 'foo', 'bar'.
> Mdev will continue to have UUID.
> phys_port_name=mdev_name
> 
> Pros:
> 1. All same as option-1, except mdevctl needs upgrade for newer usage.
> It is common practice to upgrade iproute2 package along with the
> kernel. Similar practice to be done with mdevctl.
> 2. Newer users of mdevctl who want to work with non_UUID names will
> use newer mdevctl/tools.
> Cons:
> 1. Dual naming scheme of mdev might affect some of the existing tools.
> It's unclear how/if it actually affects.
> mdevctl [2] is very recently developed and can be enhanced for dual
> naming scheme.

I think we've already nak'ed this one, the device namespace becomes
meaningless if the name becomes just a string where a uuid might be an
example string.  mdevs are named by uuid.
 
> Option-3: mdev uuid alias
> Instead of shorter mdev name or mdev index, have alpha-numeric name
> alias. Alias is an optional mdev sysfs attribute such as 'foo', 'bar'.
> example mdev create command:
> UUID=$(uuidgen)
> echo $UUID alias=foo
> > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> example netdevs:
> repnetdev = ens2f0_mfoo
> mdev_netdev=enmfoo
> 
> Pros:
> 1. All same as option-1.
> 2. Doesn't affect existing mdev naming scheme.
> Cons:
> 1. Index scheme of option-1 is better which can number large number
> of mdevs with fewer characters, simplifying the management tool.

No better than option-1, simply a larger secondary namespace, but still
requires the user to come up with two independent names for the device.

> Option-4: extend IFNAMSIZ to be 64 bytes
> Extend IFNAMSIZ from 16 to 64 bytes
> phys_port_name=mdev_UUID_string

Re: [Pv-drivers] [PATCH 1/4] x86/vmware: Update platform detection code for VMCALL/VMMCALL hypercalls

2019-08-20 Thread Darren Hart

> On Aug 18, 2019, at 12:20 PM, Thomas Gleixner  wrote:
> 
> While at it could you please ask your legal folks whether that custom
> license boilerplate can go away as well?

If you’re referring to the GPL boilerplate with “no warranty” and physical
address, then yes, as a matter of best practice (at VMware), that can and
should all be removed when adding the SPDX identifier - and of course, as
you said, be done as a separate patch.

Thanks,

-- 
Darren Hart


Re: [PATCH] IB/mlx4: Fix memory leaks

2019-08-20 Thread Doug Ledford
On Sun, 2019-08-18 at 15:23 -0500, Wenwen Wang wrote:
> In mlx4_ib_alloc_pv_bufs(), 'tun_qp->tx_ring' is allocated through
> kcalloc(). However, it is not always deallocated in the following
> execution
> if an error occurs, leading to memory leaks. To fix this issue, free
> 'tun_qp->tx_ring' whenever an error occurs.
> 
> Signed-off-by: Wenwen Wang 
> ---

Thanks, applied to for-rc.
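
For readers following along, the class of bug fixed here is an allocation
that is not released on every error path. Below is a minimal userspace
sketch of the usual unwind pattern. It is not the mlx4 code: struct
tun_qp_sketch, alloc_bufs_sketch() and the fail_late flag are hypothetical
stand-ins, and calloc()/free() stand in for kcalloc()/kfree().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queue pair that owns two rings. */
struct tun_qp_sketch {
	int *ring;
	int *tx_ring;
};

/*
 * Allocates both rings. fail_late simulates a failure that happens after
 * tx_ring was allocated, which is the case where a leak is easy to miss.
 */
static int alloc_bufs_sketch(struct tun_qp_sketch *qp, size_t n, int fail_late)
{
	assert(qp != NULL);

	qp->ring = calloc(n, sizeof(*qp->ring));
	if (!qp->ring)
		return -ENOMEM;

	qp->tx_ring = calloc(n, sizeof(*qp->tx_ring));
	if (!qp->tx_ring)
		goto err_ring;

	if (fail_late)
		goto err_tx_ring;

	return 0;

err_tx_ring:
	/* Every exit after the tx_ring allocation must pass through here. */
	free(qp->tx_ring);
	qp->tx_ring = NULL;
err_ring:
	free(qp->ring);
	qp->ring = NULL;
	return -ENOMEM;
}
```

Unwinding in reverse order of allocation keeps each label responsible for
exactly one resource, so a newly added failure point only needs the right
goto target.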

-- 
Doug Ledford 
GPG KeyID: B826A3330E572FDD
Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD




Re: [PATCH] fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t

2019-08-20 Thread Matthew Wilcox
On Tue, Aug 20, 2019 at 07:08:18PM +0200, Sebastian Siewior wrote:
> Bit spinlocks are problematic if PREEMPT_RT is enabled, because they
> disable preemption, which is undesired for latency reasons and breaks when
> regular spinlocks are taken within the bit_spinlock locked region because
> regular spinlocks are converted to 'sleeping spinlocks' on RT. So RT
> replaces the bit spinlocks with regular spinlocks to avoid this problem.
> Bit spinlocks are also not covered by lock debugging, e.g. lockdep.
> 
> Substitute the BH_Uptodate_Lock bit spinlock with a regular spinlock.
> 
> Signed-off-by: Thomas Gleixner 
> [bigeasy: remove the wrapper and use always spinlock_t]

Uhh ... always grow the buffer_head, even for non-PREEMPT_RT?  Why?
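
To illustrate the trade-off behind that question, here is a userspace sketch
in C11. It is only an approximation under stated assumptions: the struct
layouts and the *_sketch names are invented, the lock is built on
<stdatomic.h> rather than the kernel primitives, and it does not model the
preemption disabling that kernel bit spinlocks perform.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Variant 1: the lock is a single bit inside an existing state word. */
struct bh_bitlock_sketch {
	atomic_ulong state;
	void *next;
};

/* Variant 2: a dedicated lock member, so every instance gets larger. */
struct bh_spinlock_sketch {
	atomic_ulong state;
	void *next;
	atomic_flag lock;
};

/* Spins until this caller is the one that flipped the bit from 0 to 1. */
static void bit_spin_lock_sketch(unsigned int bit, atomic_ulong *word)
{
	unsigned long mask;

	assert(bit < sizeof(unsigned long) * CHAR_BIT);
	mask = 1UL << bit;
	while (atomic_fetch_or_explicit(word, mask, memory_order_acquire) & mask)
		; /* bit was already set: someone else holds the lock */
}

/* Clears the lock bit and leaves the other state bits untouched. */
static void bit_spin_unlock_sketch(unsigned int bit, atomic_ulong *word)
{
	assert(bit < sizeof(unsigned long) * CHAR_BIT);
	atomic_fetch_and_explicit(word, ~(1UL << bit), memory_order_release);
}
```

The bit lock costs no extra memory per object, whereas the second struct is
strictly larger than the first. That per-object growth is the cost being
questioned for the non-RT case.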



Re: [patch V2 0/7] fs: Substitute bit-spinlocks for PREEMPT_RT and debugging

2019-08-20 Thread Sebastian Siewior
On 2019-08-10 01:18:34 [-0700], Christoph Hellwig wrote:
> > > Does SLUB work on -rt at all?
> > 
> > It's the only allocator we support with a few tweaks :)
> 
> What do you do about this particular piece of code there?

This part remains untouched. This "lock" is acquired within ->list_lock
which is a raw_spinlock_t and disables preemption/interrupts on -RT.

Sebastian


Re: Status of Subsystems

2019-08-20 Thread Theodore Y. Ts'o
On Tue, Aug 20, 2019 at 03:56:24PM +0200, Sebastian Duda wrote:
> 
> so the status of the files is inherited from the subsystem `INPUT MULTITOUCH
> (MT) PROTOCOL`?
> 
> Is it the same with the subsystem `NOKIA N900 POWER SUPPLY DRIVERS`
> (respectively `POWER SUPPLY CLASS/SUBSYSTEM and DRIVERS`)?

Note that the definition of "subsystems" is not necessarily precise.
So assuming there is a strict subclassing and inheritance might not be
a perfect assumption.  There are some files which have no official
owner, and there are also some files which may be modified by more
than one subsystem.

We certainly don't talk about "inheritance" when we talk about
maintainers and sub-maintainers.  Furthermore, the relationships,
processes, and workflows between a particular maintainer and their
submaintainers can be unique to a particular maintainer.

We define these terms to be convenient for Linux development, and like
many human institutions, they can be flexible and messy.  The goal was
*not* to define things so it would be convenient for academics writing
papers --- like insects under glass.

Cheers,

- Ted



Re: [PATCH] mips: avoid explicit UB in assignment of mips_io_port_base

2019-08-20 Thread Nick Desaulniers
Hi Paul,
Bumping this thread; we'd really like to be able to boot test another
ISA in our CI.  This lone patch is affecting our ability to boot.  Can
you please pick it up?
https://lore.kernel.org/lkml/20190729211014.39333-1-ndesaulni...@google.com/

On Wed, Aug 7, 2019 at 2:12 PM Nick Desaulniers  wrote:
>
> Sorry for the delayed response, literally sent the patch then went on 
> vacation.
>
> On Mon, Jul 29, 2019 at 3:16 PM Maciej W. Rozycki  
> wrote:
> >
> > On Mon, 29 Jul 2019, Nick Desaulniers wrote:
> >
> > > The code in question is modifying a variable declared const through
> > > pointer manipulation.  Such code is explicitly undefined behavior, and
> > > is the lone issue preventing malta_defconfig from booting when built
> > > with Clang:
> > >
> > > If an attempt is made to modify an object defined with a const-qualified
> > > type through use of an lvalue with non-const-qualified type, the
> > > behavior is undefined.
> > >
> > > LLVM is removing such assignments. A simple fix is to not declare
> > > variables const that you plan on modifying.  Limiting the scope would be
> > > a better method of preventing unwanted writes to such a variable.
>
> This is now documented in the LLVM release notes for Clang-9:
> https://github.com/llvm/llvm-project/commit/e39e79358fcdd5d8ad809defaa821f0bbfa809a5
>
> > >
> > > Further, the code in question mentions "compiler bugs" without any links
> > > to bug reports, so it is difficult to know if the issue is resolved in
> > > GCC. The patch was authored in 2006, which would have been GCC 4.0.3 or
> > > 4.1.1. The minimal supported version of GCC in the Linux kernel is
> > > currently 4.6.
> >
> >  It's somewhat older than that.  My investigation points to:
> >
> > commit c94e57dcd61d661749d53ee876ab265883b0a103
> > Author: Ralf Baechle 
> > Date:   Sun Nov 25 09:25:53 2001 +
> >
> > Cleanup of include/asm-mips/io.h.  Now looks neat and harmless.
>
> Oh indeed, great find!
>
> So it looks to me like the order of events is:
> 1. 
> https://github.com/jaaron/linux-mips-ip30/commit/c94e57dcd61d661749d53ee876ab265883b0a103
> in 2001 first introduces the UB.  mips_io_port_base is defined
> non-const in arch/mips/kernel/setup.c, but then declared extern const
> (and modified via UB) in include/asm-mips/io.h.  A setter is created,
> but not a getter (I'll revisit this below).  This appears to work (due
> to luck) for a few years until:
> 2. 
> https://github.com/mpe/linux-fullhistory/commit/966f4406d903a4214fdc74bec54710c6232a95b8
> in 2006 adds a compiler barrier (reload all variables) and this
> appears to work.  The commit message mentions that reads after
> modification of the const variable were buggy (likely GCC started
> taking advantage of the explicit UB around this time as well).  This
> isn't a fix for UB (more thoughts below), but appears to work.
> 3. 
> https://github.com/llvm/llvm-project/commit/b45631090220b732e614b5530bbd1d230eb9d38e
> in 2019 removes writes to const variables in LLVM as that's explicit
> UB.  We observe the boot failure in mips and narrow it down to this
> instance.
>
> I can see how throwing a compiler barrier in there made subsequent
> reads after UB writes appear to work, but that was more due to luck
> and implementation details of GCC than the heart of the issue (ie. not
> writing code that is explicitly undefined behavior)(and could change
> in future versions of GCC).  Stated another way, the fix for explicit
> UB is not hacks, but avoiding the UB by rewriting the problematic
> code.
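
A minimal userspace sketch of what "avoiding the UB" can look like: the
variable is simply not const, and its scope is limited to one file with
access going through a setter and a getter. This is an illustration, not the
proposed MIPS patch. The *_sketch names are hypothetical, and the assert()
that rejects a second write is an optional assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* File scope and non-const: writing to it is well defined. */
static unsigned long io_port_base_sketch;
static bool io_port_base_is_set;

/* The only writer. The assert is a hypothetical write-once guard. */
static void set_io_port_base_sketch(unsigned long base)
{
	assert(!io_port_base_is_set);
	io_port_base_sketch = base;
	io_port_base_is_set = true;
}

/* Readers go through the getter instead of touching the variable. */
static unsigned long get_io_port_base_sketch(void)
{
	return io_port_base_sketch;
}
```

Because nothing is declared const, the compiler has no licence to drop the
store, and the limited scope still prevents stray writes from other files.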
>
> > However the purpose of the arrangement does not appear to me to be
> > particularly specific to a compiler version.
> >
> > > For what it's worth, there was UB before the commit in question, it just
> > > added a barrier and got lucky IRT codegen. I don't think there's any
> > > actual compiler bugs related, just runtime bugs due to UB.
> >
> >  Does your solution preserve the original purpose of the hack though as
> > documented in the comment you propose to be removed?
>
> The function modified simply writes to a global variable.  It's not
> clear to me why the value about to be modified would EVER be loaded
> before modification.
>
> >  Clearly it was defined enough to work for almost 18 years, so it would be
> > good to keep the optimisation functionally by using different means that
> > do not rely on UB.
>
> "Defined enough" ???
> https://youtu.be/Aq_1l316ow8?t=17
>
> > This variable is assigned at most once throughout the
> > life of the kernel and then early on, so considering it r/w with all the
> > consequences for all accesses does not appear to me to be a good use of
> > it.
>
> Note: it's not possible to express the semantics of a "write once
> variable" in C short of static initialization (AFAIK, without explicit
> violation of UB, but Cunningham's Law may apply).
>
> (set_io_port_base is called in ~20 places)
>
> Thinking more about this while I was away, I think what this code has
> needed since 2001 is proper 

Re: [PATCH] perf/x86: Consider pinned events for group validation

2019-08-20 Thread Liang, Kan

+   /*
+* The new group must can be scheduled
+* together with current pinned events.
+* Otherwise, it will never get a chance
+* to be scheduled later.


That's wrapped short; also I don't think it is sufficient; what if you
happen to have a pinned event on CPU1 (and not others) and happen to run
validation for a new CPU1 event on CPUn ?



The patch doesn't support this case.


Which makes the whole thing even more random.


Maybe we can use the cpuc on event->cpu. That could help a little here.
cpuc = per_cpu_ptr(&cpu_hw_events, event->cpu >= 0 ? event->cpu :
raw_smp_processor_id());





It is mentioned in the description.
The patch doesn't intend to catch all possible cases that cannot be
scheduled. I think it's impossible to catch all cases.
We only want to improve the validate_group() a little bit to catch some
common cases, e.g. NMI watchdog interacting with group.


Also; per that same; it is broken, you're accessing the cpu-local cpuc
without serialization.


Do you mean accessing all cpuc serially?
We only check the cpuc on current CPU here. It doesn't intend to access
other cpuc.


There's nothing preventing the cpuc you're looking at changing while
you're looking at it. Heck, afaict it is possible to UaF here. Nothing
prevents the events you're looking at from going away and getting freed.


You are right.
I think we can add a lock to protect the event_list[] in x86_pmu_add()
and x86_pmu_del().



Thanks,
Kan


Re: [PATCH v3 3/4] perf: Use CAP_SYSLOG with kptr_restrict checks

2019-08-20 Thread Arnaldo Carvalho de Melo
Em Mon, Aug 19, 2019 at 10:22:07PM +, Lubashev, Igor escreveu:
> On Mon, August 19, 2019 at 12:51 PM Mathieu Poirier 
>  wrote:
> > On Thu, 15 Aug 2019 at 15:42, Arnaldo Carvalho de Melo 
> >  wrote:
> > Things are working properly on your perf/cap branch.  I tested on both
> > x86 and ARM.
 
> Mathieu, you are probably testing with euid==0.  If you were to test
> with euid!=0 but with CAP_SYSLOG and no libcap (and kptr_restrict=0,
> perf_event_paranoid=2), you would likely hit the bug that you
> identified in __perf_event__synthesize_kernel_mmap().
 
> See 
> https://lkml.kernel.org/lkml/930a59730c0d495f8c5acf4f99048...@usma1ex-dag1mb6.msg.corp.akamai.com
>  for the fix (Option 1 only or Options 1+2).
> 
> Arnaldo, once we decide what the right fix is, I am happy to post the update 
> (options 1, 1+2) as a patch series.

I think you should get the checks for ref_reloc_sym in place so as to
make the code overall more robust, and also go on continuing to make the
checks in tools/perf/ to match what is checked on the other side of the
mirror, i.e. by the kernel, so from a quick read, please put first the
robustness patches (check ref_reloc_sym) do your other suggestions and
update the warnings, then refresh the two patches that still are not in
my perf/core branch:

[acme@quaco perf]$ git rebase perf/core
First, rewinding head to replay your work on top of it...
Applying: perf tools: Use CAP_SYS_ADMIN with perf_event_paranoid checks
Applying: perf symbols: Use CAP_SYSLOG with kptr_restrict checks
[acme@quaco perf]$ 
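
As an illustration of the kind of ref_reloc_sym robustness check meant
above, here is a small userspace sketch. The two structs are simplified
stand-ins rather than the real perf definitions, and
kmap_ref_reloc_addr_sketch() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the perf structures (assumed layout). */
struct ref_reloc_sym_sketch {
	const char *name;
	unsigned long long addr;
};

struct kmap_sketch {
	struct ref_reloc_sym_sketch *ref_reloc_sym;
};

/*
 * Copies the reference symbol address to *addr and returns 0, or returns -1
 * without dereferencing anything if the map or its ref_reloc_sym is missing.
 */
static int kmap_ref_reloc_addr_sketch(const struct kmap_sketch *kmap,
				      unsigned long long *addr)
{
	assert(addr != NULL);

	if (kmap == NULL || kmap->ref_reloc_sym == NULL)
		return -1;

	*addr = kmap->ref_reloc_sym->addr;
	return 0;
}
```

Funnelling accesses through one checked helper means callers handle an
error code instead of each call site having to remember the NULL test.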

I've pushed out perf/cap, so you can go from there as it is rebased on
my current perf/core.

Then test all these cases: with/without libcap, with euid==0 and
different than zero, with capabilities, etc, patch by patch so that we
don't break bisection nor regress,

Thanks and keep up the good work!

- Arnaldo
 
> - Igor
> 
> 
> > > > I am not sure how this can be fixed.  I counted a total of 19
> > > > instances where kmap->ref_reloc_sym->XYZ is called, only 2 of which
> > > > care to check if kmap->ref_reloc_sym is valid before proceeding.  As
> > > > such I must hope that in the 17 other cases, kmap->ref_reloc_sym is
> > > > guaranteed to be valid.  If I am correct then all we need is to
> > > > check for a valid pointer in __perf_event__synthesize_kernel_mmap().
> > > > Otherwise it will be a little harder.
> > > >
> > > > Mathieu
> 

-- 

- Arnaldo


[PATCH] dt-bindings: arm: Add kryo485 compatible

2019-08-20 Thread Vinod Koul
Kryo485 is found in SM8150, so add it to the list of CPU compatibles.

Signed-off-by: Vinod Koul 
---
 Documentation/devicetree/bindings/arm/cpus.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/arm/cpus.yaml b/Documentation/devicetree/bindings/arm/cpus.yaml
index aa40b074b864..032f759612af 100644
--- a/Documentation/devicetree/bindings/arm/cpus.yaml
+++ b/Documentation/devicetree/bindings/arm/cpus.yaml
@@ -155,6 +155,7 @@ properties:
   - qcom,krait
   - qcom,kryo
   - qcom,kryo385
+  - qcom,kryo485
   - qcom,scorpion
 
   enable-method:
-- 
2.20.1



[PATCH] fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t

2019-08-20 Thread Sebastian Siewior
From: Thomas Gleixner 

Bit spinlocks are problematic if PREEMPT_RT is enabled, because they
disable preemption, which is undesired for latency reasons and breaks when
regular spinlocks are taken within the bit_spinlock locked region because
regular spinlocks are converted to 'sleeping spinlocks' on RT. So RT
replaces the bit spinlocks with regular spinlocks to avoid this problem.
Bit spinlocks are also not covered by lock debugging, e.g. lockdep.

Substitute the BH_Uptodate_Lock bit spinlock with a regular spinlock.

Signed-off-by: Thomas Gleixner 
[bigeasy: remove the wrapper and use always spinlock_t]
Signed-off-by: Sebastian Andrzej Siewior 
---
 fs/buffer.c | 19 +++
 fs/ext4/page-io.c   |  8 +++-
 fs/ntfs/aops.c  |  9 +++--
 include/linux/buffer_head.h |  6 +++---
 4 files changed, 16 insertions(+), 26 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 131d39ec7d316..eab37fbaa439f 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -275,8 +275,7 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
 * decide that the page is now completely done.
 */
first = page_buffers(page);
-   local_irq_save(flags);
-   bit_spin_lock(BH_Uptodate_Lock, &first->b_state);
+   spin_lock_irqsave(&first->uptodate_lock, flags);
clear_buffer_async_read(bh);
unlock_buffer(bh);
tmp = bh;
@@ -289,8 +288,7 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
}
tmp = tmp->b_this_page;
} while (tmp != bh);
-   bit_spin_unlock(BH_Uptodate_Lock, &first->b_state);
-   local_irq_restore(flags);
+   spin_unlock_irqrestore(&first->uptodate_lock, flags);
 
/*
 * If none of the buffers had errors and they are all
@@ -302,8 +300,7 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
return;
 
 still_busy:
-   bit_spin_unlock(BH_Uptodate_Lock, &first->b_state);
-   local_irq_restore(flags);
+   spin_unlock_irqrestore(&first->uptodate_lock, flags);
return;
 }
 
@@ -331,8 +328,7 @@ void end_buffer_async_write(struct buffer_head *bh, int uptodate)
}
 
first = page_buffers(page);
-   local_irq_save(flags);
-   bit_spin_lock(BH_Uptodate_Lock, &first->b_state);
+   spin_lock_irqsave(&first->uptodate_lock, flags);
 
clear_buffer_async_write(bh);
unlock_buffer(bh);
@@ -344,14 +340,12 @@ void end_buffer_async_write(struct buffer_head *bh, int uptodate)
}
tmp = tmp->b_this_page;
}
-   bit_spin_unlock(BH_Uptodate_Lock, &first->b_state);
-   local_irq_restore(flags);
+   spin_unlock_irqrestore(&first->uptodate_lock, flags);
end_page_writeback(page);
return;
 
 still_busy:
-   bit_spin_unlock(BH_Uptodate_Lock, &first->b_state);
-   local_irq_restore(flags);
+   spin_unlock_irqrestore(&first->uptodate_lock, flags);
return;
 }
 EXPORT_SYMBOL(end_buffer_async_write);
@@ -3420,6 +3414,7 @@ struct buffer_head *alloc_buffer_head(gfp_t gfp_flags)
struct buffer_head *ret = kmem_cache_zalloc(bh_cachep, gfp_flags);
if (ret) {
INIT_LIST_HEAD(&ret->b_assoc_buffers);
+   spin_lock_init(&ret->uptodate_lock);
preempt_disable();
__this_cpu_inc(bh_accounting.nr);
recalc_bh_state();
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 12ceadef32c5a..7745ed23c6ad9 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -87,11 +87,10 @@ static void ext4_finish_bio(struct bio *bio)
}
bh = head = page_buffers(page);
/*
-* We check all buffers in the page under BH_Uptodate_Lock
+* We check all buffers in the page under uptodate_lock
 * to avoid races with other end io clearing async_write flags
 */
-   local_irq_save(flags);
-   bit_spin_lock(BH_Uptodate_Lock, &head->b_state);
+   spin_lock_irqsave(&head->uptodate_lock, flags);
do {
if (bh_offset(bh) < bio_start ||
bh_offset(bh) + bh->b_size > bio_end) {
@@ -103,8 +102,7 @@ static void ext4_finish_bio(struct bio *bio)
if (bio->bi_status)
buffer_io_error(bh);
} while ((bh = bh->b_this_page) != head);
-   bit_spin_unlock(BH_Uptodate_Lock, &head->b_state);
-   local_irq_restore(flags);
+   spin_unlock_irqrestore(&head->uptodate_lock, flags);
if (!under_io) {
fscrypt_free_bounce_page(bounce_page);
end_page_writeback(page);
diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
index 7202a1e39d70c..14ca433b3a9e4 100644
--- a/fs/ntfs/aops.c
+++ b/fs/ntfs/aops.c
@@ -92,8 +92,7 @@ static void ntfs_end_buffer_async_read(struct buffer_head *bh, int uptodate)
  

Re: [PATCH v3 4/8] PCI: Add quirk to disable MSI-X support for Amazon's Annapurna Labs Root Port

2019-08-20 Thread Chocron, Jonathan
On Tue, 2019-08-20 at 16:25 +0100, Andrew Murray wrote:
> On Tue, Aug 20, 2019 at 02:52:30PM +, Chocron, Jonathan wrote:
> > On Mon, 2019-08-19 at 19:23 +0100, Andrew Murray wrote:
> > > On Tue, Jul 23, 2019 at 12:25:29PM +0300, Jonathan Chocron wrote:
> > > > The Root Port (identified by [1c36:0032]) doesn't support MSI-X. On some
> > > 
> > > Shouldn't this read [1c36:0031]?
> > > 
> > 
> > Indeed. Thanks for catching this.
> > 
> > > 
> > > > platforms it is configured to not advertise the capability at all, while
> > > > on others it (mistakenly) does. This causes a panic during
> > > > initialization by the pcieport driver, since it tries to configure the
> > > > MSI-X capability. Specifically, when trying to access the MSI-X table
> > > > a "non-existing addr" exception occurs.
> > > > Example stacktrace snippet:
> > > > 
> > > > [1.632363] SError Interrupt on CPU2, code 0xbf00 -- SError
> > > > [1.632364] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc1-Jonny-14847-ge76f1d4a1828-dirty #33
> > > > [1.632365] Hardware name: Annapurna Labs Alpine V3 EVP (DT)
> > > > [1.632365] pstate: 8005 (Nzcv daif -PAN -UAO)
> > > > [1.632366] pc : __pci_enable_msix_range+0x4e4/0x608
> > > > [1.632367] lr : __pci_enable_msix_range+0x498/0x608
> > > > [1.632367] sp : ff80117db700
> > > > [1.632368] x29: ff80117db700 x28: 0001
> > > > [1.632370] x27: 0001 x26: 
> > > > [1.632372] x25: ffd3e9d8c0b0 x24: 
> > > > [1.632373] x23:  x22: 
> > > > [1.632375] x21: 0001 x20: 
> > > > [1.632376] x19: ffd3e9d8c000 x18: 
> > > > [1.632378] x17:  x16: 
> > > > [1.632379] x15: ff80116496c8 x14: ffd3e9844503
> > > > [1.632380] x13: ffd3e9844502 x12: 0038
> > > > [1.632382] x11: ff00 x10: 0040
> > > > [1.632384] x9 : ff801165e270 x8 : ff801165e268
> > > > [1.632385] x7 : 0002 x6 : 00b2
> > > > [1.632387] x5 : ffd3e9d8c2c0 x4 : 
> > > > [1.632388] x3 :  x2 : 
> > > > [1.632390] x1 :  x0 : ffd3e9844680
> > > > [1.632392] Kernel panic - not syncing: Asynchronous SError Interrupt
> > > > [1.632393] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc1-Jonny-14847-ge76f1d4a1828-dirty #33
> > > > [1.632394] Hardware name: Annapurna Labs Alpine V3 EVP (DT)
> > > > [1.632394] Call trace:
> > > > [1.632395]  dump_backtrace+0x0/0x140
> > > > [1.632395]  show_stack+0x14/0x20
> > > > [1.632396]  dump_stack+0xa8/0xcc
> > > > [1.632396]  panic+0x140/0x334
> > > > [1.632397]  nmi_panic+0x6c/0x70
> > > > [1.632398]  arm64_serror_panic+0x74/0x88
> > > > [1.632398]  __pte_error+0x0/0x28
> > > > [1.632399]  el1_error+0x84/0xf8
> > > > [1.632400]  __pci_enable_msix_range+0x4e4/0x608
> > > > [1.632400]  pci_alloc_irq_vectors_affinity+0xdc/0x150
> > > > [1.632401]  pcie_port_device_register+0x2b8/0x4e0
> > > > [1.632402]  pcie_portdrv_probe+0x34/0xf0
> > > > 
> > > > Signed-off-by: Jonathan Chocron 
> > > > Reviewed-by: Gustavo Pimentel 
> > > > ---
> > > >  drivers/pci/quirks.c | 15 +++
> > > >  1 file changed, 15 insertions(+)
> > > > 
> > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > > index 23672680dba7..11f843aa96b3 100644
> > > > --- a/drivers/pci/quirks.c
> > > > +++ b/drivers/pci/quirks.c
> > > > @@ -2925,6 +2925,21 @@
> > > > DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x10a1,
> > > > quirk_msi_intx_disable_qca_bug);
> > > >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0xe091,
> > > > quirk_msi_intx_disable_qca_bug);
> > > > +
> > > > +/*
> > > > + * Amazon's Annapurna Labs 1c36:0031 Root Ports don't support MSI-X, so it
> > > > + * should be disabled on platforms where the device (mistakenly) advertises it.
> > > > + *
> > > > + * The 0031 device id is reused for other non Root Port device types,
> > > > + * therefore the quirk is registered for the PCI_CLASS_BRIDGE_PCI class.
> > > > + */
> > > > +static void quirk_al_msi_disable(struct pci_dev *dev)
> > > > +{
> > > > +   dev->no_msi = 1;
> > > > +   pci_warn(dev, "Disabling MSI-X\n");
> > > 
> > > This will disable both MSI and MSI-X support - is this really the intention
> > > here? Do the root ports support MSI and legacy, or just legacy?
> > > 
> > 
> > The HW should support MSI, but we currently don't have a use case for
> > it so it hasn't been tested and therefore we are okay with disabling
> > it.
> 
> OK - then the 

RE: [PATCH v3 0/8] thunderbolt: Intel Ice Lake support

2019-08-20 Thread Mario.Limonciello
> -Original Message-
> From: Lukas Wunner 
> Sent: Tuesday, August 20, 2019 6:34 AM
> To: Limonciello, Mario
> Cc: mika.westerb...@linux.intel.com; linux-kernel@vger.kernel.org;
> andreas.noe...@gmail.com; michael.ja...@intel.com;
> yehezkel...@gmail.com; r...@rjwysocki.net; l...@kernel.org;
> anthony.w...@canonical.com; rajmohan.m...@intel.com;
> raanan.avar...@intel.com; david.lai...@aculab.com; linux-
> a...@vger.kernel.org
> Subject: Re: [PATCH v3 0/8] thunderbolt: Intel Ice Lake support
> 
> 
> [EXTERNAL EMAIL]
> 
> On Mon, Aug 19, 2019 at 04:29:35PM +, mario.limoncie...@dell.com wrote:
> > I've run into a problem when using
> > a WD19TB that after unplugging it will cause the following to spew in dmesg:
> >
> > [ 2198.017003] 
> > [ 2198.017005] WARNING: possible recursive locking detected
> > [ 2198.017008] 5.3.0-rc5+ #75 Not tainted
> > [ 2198.017009] 
> > [ 2198.017012] irq/122-pciehp/121 is trying to acquire lock:
> > [ 2198.017015] 801d4de8 (&ctrl->reset_lock){.+.+}, at:
> pciehp_check_presence+0x1b/0x80
> > [ 2198.017026]
> >but task is already holding lock:
> > [ 2198.017028] 0899e2eb (&ctrl->reset_lock){.+.+}, at:
> pciehp_ist+0xaf/0x1c0
> 
> This was first reported by Theodore in April and appears to be a
> false positive:
> 
> https://lore.kernel.org/linux-pci/20190402083257.kyqmirq4ovzsc...@wunner.de/
> 
> Thanks,
> 
> Lukas

Indeed it does actually seem harmless and only comes up once.

I've filed https://bugzilla.kernel.org/show_bug.cgi?id=204639 to track down
further what's going on.


Re: [PATCH v6 3/4] dt-bindings: arm: fsl: Add Kontron i.MX6UL N6310 compatibles

2019-08-20 Thread Rob Herring
On Tue, Aug 20, 2019 at 10:35 AM Krzysztof Kozlowski  wrote:
>
> Add the compatibles for Kontron i.MX6UL N6310 SoM and boards.
>
> Signed-off-by: Krzysztof Kozlowski 
>
> ---
>
> Changes since v5:
> New patch
> ---
>  Documentation/devicetree/bindings/arm/fsl.yaml | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/arm/fsl.yaml b/Documentation/devicetree/bindings/arm/fsl.yaml
> index 7294ac36f4c0..d07b3c06d7cf 100644
> --- a/Documentation/devicetree/bindings/arm/fsl.yaml
> +++ b/Documentation/devicetree/bindings/arm/fsl.yaml
> @@ -161,6 +161,9 @@ properties:
>  items:
>- enum:
>- fsl,imx6ul-14x14-evk  # i.MX6 UltraLite 14x14 EVK Board
> +  - kontron,imx6ul-n6310-som  # Kontron N6310 SOM
> +  - kontron,imx6ul-n6310-s# Kontron N6310 S Board
> +  - kontron,imx6ul-n6310-s-43 # Kontron N6310 S 43 Board

This doesn't match what is in your dts files. Run 'make dtbs_check' and see.

>- const: fsl,imx6ul
>
>- description: i.MX6ULL based Boards
> --
> 2.7.4
>


Re: [PATCH 2/2] uacce: add uacce module

2019-08-20 Thread Greg Kroah-Hartman
On Tue, Aug 20, 2019 at 09:08:55PM +0800, zhangfei wrote:
> 
> 
> On 2019/8/15 10:13 PM, Greg Kroah-Hartman wrote:
> > On Wed, Aug 14, 2019 at 05:34:25PM +0800, Zhangfei Gao wrote:
> > > +int uacce_register(struct uacce *uacce)
> > > +{
> > > + int ret;
> > > +
> > > + if (!uacce->pdev) {
> > > + pr_debug("uacce parent device not set\n");
> > > + return -ENODEV;
> > > + }
> > > +
> > > + if (uacce->flags & UACCE_DEV_NOIOMMU) {
> > > + add_taint(TAINT_CRAP, LOCKDEP_STILL_OK);
> > > + dev_warn(uacce->pdev,
> > > +  "Register to noiommu mode, which export kernel data to 
> > > user space and may vulnerable to attack");
> > > + }
> > > That is odd, why even offer this feature then if it is a major issue?
> > UACCE_DEV_NOIOMMU may be confusing here.
> 
> In this mode, app use ioctl to get dma_handle from dma_alloc_coherent.

That's odd, why not use the other default apis to do that?

> It does not matter whether the iommu is enabled or not.
> In case the iommu is disabled, it may be dangerous to the kernel, so we
> added a warning here; is it required?

You should use the other documented apis for this, don't create your
own.

thanks,

greg k-h


Re: [PATCH v6,1/2] PCI: hv: Detect and fix Hyper-V PCI domain number collision

2019-08-20 Thread Lorenzo Pieralisi
On Thu, Aug 15, 2019 at 05:01:37PM +, Haiyang Zhang wrote:
> Currently in Azure cloud, for passthrough devices, the host sets the device
> instance ID's bytes 8 - 15 to a value derived from the host HWID, which is
> the same on all devices in a VM. So, the device instance ID's bytes 8 and 9
> provided by the host are no longer unique. This affects all Azure hosts
> since July 2018, and can cause device passthrough to VMs to fail because
> the bytes 8 and 9 are used as PCI domain number. Collision of domain
> numbers will cause the second device with the same domain number to fail to
> load.
> 
> In the cases of collision, we will detect and find another number that is
> not in use.
> 
> Suggested-by: Michael Kelley 
> Signed-off-by: Haiyang Zhang 
> Acked-by: Sasha Levin 
> ---
>  drivers/pci/controller/pci-hyperv.c | 92 +++--
>  1 file changed, 79 insertions(+), 13 deletions(-)

I have applied both patches to pci/hv for v5.4.

Thanks,
Lorenzo
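
For reference, the collision handling described in the quoted commit message
can be approximated in userspace C as below. This is a sketch under
assumptions, not the driver code. It uses a plain array as the bitmap
instead of the kernel bitmap helpers, its test_and_set_sketch() is not
atomic, and it reserves domain 0 by skipping it in the search, because how
the patch reserves 0 is not visible in the excerpt. All *_sketch and DOM_*
names are made up here.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define DOM_MAP_SIZE	(64 * 1024)
#define DOM_INVALID	0
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* One bit per possible 16-bit domain number, all initially free. */
static unsigned long dom_map[DOM_MAP_SIZE / BITS_PER_WORD];

/* Sets bit nr and returns whether it was already set (not atomic). */
static bool test_and_set_sketch(unsigned int nr)
{
	unsigned long mask;
	unsigned long *word;
	bool was_set;

	assert(nr < DOM_MAP_SIZE);
	mask = 1UL << (nr % BITS_PER_WORD);
	word = &dom_map[nr / BITS_PER_WORD];
	was_set = (*word & mask) != 0;
	*word |= mask;
	return was_set;
}

/* Returns dom if it is free, else the lowest free non-zero number. */
static uint16_t get_dom_num_sketch(uint16_t dom)
{
	unsigned int i;

	if (dom != DOM_INVALID && !test_and_set_sketch(dom))
		return dom;

	for (i = 1; i < DOM_MAP_SIZE; i++) {
		if (!test_and_set_sketch(i))
			return (uint16_t)i;
	}

	return DOM_INVALID;
}

/* Marks a domain number as free again. */
static void put_dom_num_sketch(uint16_t dom)
{
	dom_map[dom / BITS_PER_WORD] &= ~(1UL << (dom % BITS_PER_WORD));
}
```

Requesting the same number twice yields the requested value the first time
and the lowest free number the second time, which is the fallback the commit
message describes for colliding instance IDs.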

> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 40b6254..31b8fd5 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -2510,6 +2510,48 @@ static void put_hvpcibus(struct hv_pcibus_device *hbus)
>   complete(&hbus->remove_event);
>  }
>  
> +#define HVPCI_DOM_MAP_SIZE (64 * 1024)
> +static DECLARE_BITMAP(hvpci_dom_map, HVPCI_DOM_MAP_SIZE);
> +
> +/*
> + * PCI domain number 0 is used by emulated devices on Gen1 VMs, so define 0
> + * as invalid for passthrough PCI devices of this driver.
> + */
> +#define HVPCI_DOM_INVALID 0
> +
> +/**
> + * hv_get_dom_num() - Get a valid PCI domain number
> + * Check if the PCI domain number is in use, and return another number if
> + * it is in use.
> + *
> + * @dom: Requested domain number
> + *
> + * return: domain number on success, HVPCI_DOM_INVALID on failure
> + */
> +static u16 hv_get_dom_num(u16 dom)
> +{
> + unsigned int i;
> +
> + if (test_and_set_bit(dom, hvpci_dom_map) == 0)
> + return dom;
> +
> + for_each_clear_bit(i, hvpci_dom_map, HVPCI_DOM_MAP_SIZE) {
> + if (test_and_set_bit(i, hvpci_dom_map) == 0)
> + return i;
> + }
> +
> + return HVPCI_DOM_INVALID;
> +}
> +
> +/**
> + * hv_put_dom_num() - Mark the PCI domain number as free
> + * @dom: Domain number to be freed
> + */
> +static void hv_put_dom_num(u16 dom)
> +{
> + clear_bit(dom, hvpci_dom_map);
> +}
> +
>  /**
>   * hv_pci_probe() - New VMBus channel probe, for a root PCI bus
>   * @hdev:VMBus's tracking struct for this root PCI bus
> @@ -2521,6 +2563,7 @@ static int hv_pci_probe(struct hv_device *hdev,
>   const struct hv_vmbus_device_id *dev_id)
>  {
>   struct hv_pcibus_device *hbus;
> + u16 dom_req, dom;
>   int ret;
>  
>   /*
> @@ -2535,19 +2578,34 @@ static int hv_pci_probe(struct hv_device *hdev,
>   hbus->state = hv_pcibus_init;
>  
>   /*
> -  * The PCI bus "domain" is what is called "segment" in ACPI and
> -  * other specs.  Pull it from the instance ID, to get something
> -  * unique.  Bytes 8 and 9 are what is used in Windows guests, so
> -  * do the same thing for consistency.  Note that, since this code
> -  * only runs in a Hyper-V VM, Hyper-V can (and does) guarantee
> -  * that (1) the only domain in use for something that looks like
> -  * a physical PCI bus (which is actually emulated by the
> -  * hypervisor) is domain 0 and (2) there will be no overlap
> -  * between domains derived from these instance IDs in the same
> -  * VM.
> +  * The PCI bus "domain" is what is called "segment" in ACPI and other
> +  * specs. Pull it from the instance ID, to get something usually
> +  * unique. In rare cases of collision, we will find out another number
> +  * not in use.
> +  *
> +  * Note that, since this code only runs in a Hyper-V VM, Hyper-V
> +  * together with this guest driver can guarantee that (1) The only
> +  * domain used by Gen1 VMs for something that looks like a physical
> +  * PCI bus (which is actually emulated by the hypervisor) is domain 0.
> +  * (2) There will be no overlap between domains (after fixing possible
> +  * collisions) in the same VM.
>*/
> - hbus->sysdata.domain = hdev->dev_instance.b[9] |
> -hdev->dev_instance.b[8] << 8;
> + dom_req = hdev->dev_instance.b[8] << 8 | hdev->dev_instance.b[9];
> + dom = hv_get_dom_num(dom_req);
> +
> + if (dom == HVPCI_DOM_INVALID) {
> + dev_err(&hdev->device,
> + "Unable to use dom# 0x%hx or other numbers", dom_req);
> + ret = -EINVAL;
> + goto free_bus;
> + }
> +
> + if (dom != dom_req)
> + dev_info(&hdev->device,
> +  "PCI dom# 0x%hx has collision, using 0x%hx",
> +  dom_req, dom);
> +
> + 
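The allocation scheme in the quoted patch (try the requested domain number first, fall back to the first free one on collision) can be sketched in userspace C. This is only an illustration under stated assumptions: `test_and_set()`, `dom_map_init()`, `get_dom_num()` and `put_dom_num()` are hypothetical stand-ins, the bit operations are not atomic like the kernel's, and reserving bit 0 in `dom_map_init()` is an assumption of this sketch rather than something shown in the quoted hunk.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

#define DOM_MAP_SIZE (64 * 1024)  /* one bit per possible 16-bit domain */
#define DOM_INVALID  0            /* domain 0 is treated as invalid */
#define WORD_BITS    (sizeof(unsigned long) * CHAR_BIT)

static unsigned long dom_map[DOM_MAP_SIZE / WORD_BITS];

/* Set bit nr and return its previous value.
 * Non-atomic stand-in for the kernel's test_and_set_bit(). */
static int test_and_set(unsigned int nr)
{
	unsigned long mask = 1UL << (nr % WORD_BITS);
	unsigned long *word = &dom_map[nr / WORD_BITS];
	int old = (*word & mask) != 0;

	*word |= mask;
	return old;
}

/* Hypothetical helper: clear the map and reserve domain 0 so the
 * fallback scan can never hand it out (assumption of this sketch). */
static void dom_map_init(void)
{
	memset(dom_map, 0, sizeof(dom_map));
	test_and_set(DOM_INVALID);
}

/* Return the requested domain if free, else the first free one,
 * else DOM_INVALID when the map is full. */
static uint16_t get_dom_num(uint16_t dom)
{
	unsigned int i;

	if (!test_and_set(dom))
		return dom;

	for (i = 0; i < DOM_MAP_SIZE; i++)
		if (!test_and_set(i))
			return (uint16_t)i;

	return DOM_INVALID;
}

/* Mark a previously allocated domain number as free again. */
static void put_dom_num(uint16_t dom)
{
	assert(dom != DOM_INVALID);
	dom_map[dom / WORD_BITS] &= ~(1UL << (dom % WORD_BITS));
}
```

With this model, two devices that both request domain 5 end up with 5 and 1 respectively, and a freed number becomes available to the next colliding request.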

Re: [PATCH] RDMA/hns: Fix some white space check_mtu_validate()

2019-08-20 Thread Doug Ledford
On Fri, 2019-08-16 at 14:39 +0300, Dan Carpenter wrote:
> This line was indented a bit too far.
> 
> Signed-off-by: Dan Carpenter 

Thanks, applied to for-next.

-- 
Doug Ledford 
GPG KeyID: B826A3330E572FDD
Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD




Re: [PATCH v3 3/4] perf: Use CAP_SYSLOG with kptr_restrict checks

2019-08-20 Thread Mathieu Poirier
On Mon, 19 Aug 2019 at 16:22, Lubashev, Igor  wrote:
>
> On Mon, August 19, 2019 at 12:51 PM Mathieu Poirier 
>  wrote:
> > On Thu, 15 Aug 2019 at 15:42, Arnaldo Carvalho de Melo
> >  wrote:
> > >
> > > On Thu, Aug 15, 2019 at 02:16:48PM -0600, Mathieu Poirier wrote:
> > > > On Wed, 14 Aug 2019 at 14:02, Lubashev, Igor 
> > wrote:
> > > > >
> > > > > > On Wed, August 14, 2019 at 2:52 PM Arnaldo Carvalho de Melo
> >  wrote:
> > > > > > On Wed, Aug 14, 2019 at 03:48:14PM -0300, Arnaldo Carvalho de
> > > > > > Melo wrote:
> > > > > > > On Wed, Aug 14, 2019 at 12:04:33PM -0600, Mathieu Poirier wrote:
> > > > > > > > # echo 0 > /proc/sys/kernel/kptr_restrict #
> > > > > > > > ./tools/perf/perf record -e instructions:k uname
> > > > > > > > perf: Segmentation fault
> > > > > > > > Obtained 10 stack frames.
> > > > > > > > ./tools/perf/perf(sighandler_dump_stack+0x44)
> > > > > > > > [0x55af9e5da5d4]
> > > > > > > > /lib/x86_64-linux-gnu/libc.so.6(+0x3ef20) [0x7fd31efb6f20]
> > > > > > > > ./tools/perf/perf(perf_event__synthesize_kernel_mmap+0xa7)
> > > > > > > > [0x55af9e590337]
> > > > > > > > ./tools/perf/perf(+0x1cf5be) [0x55af9e50c5be]
> > > > > > > > ./tools/perf/perf(cmd_record+0x1022) [0x55af9e50dff2]
> > > > > > > > ./tools/perf/perf(+0x23f98d) [0x55af9e57c98d]
> > > > > > > > ./tools/perf/perf(+0x23fc9e) [0x55af9e57cc9e]
> > > > > > > > ./tools/perf/perf(main+0x369) [0x55af9e4f6bc9]
> > > > > > > > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)
> > > > > > > > [0x7fd31ef99b97]
> > > > > > > > ./tools/perf/perf(_start+0x2a) [0x55af9e4f704a] Segmentation
> > > > > > > > fault
> > > > > > > >
> > > > > > > > I can reproduce this on both x86 and ARM64.
> > > > > > >
> > > > > > > I don't see this with these two csets removed:
> > > > > > >
> > > > > > > 7ff5b5911144 perf symbols: Use CAP_SYSLOG with kptr_restrict
> > > > > > > checks d7604b66102e perf tools: Use CAP_SYS_ADMIN with
> > > > > > > perf_event_paranoid checks
> > > > > > >
> > > > > > > Which were the ones I guessed were related to the problem you
> > > > > > > reported, so they are out of my ongoing perf/core pull request
> > > > > > > to Ingo/Thomas, now trying with these applied and your
> > instructions...
> > > > > >
> > > > > > Can't repro:
> > > > > >
> > > > > > [root@quaco ~]# cat /proc/sys/kernel/kptr_restrict
> > > > > > 0
> > > > > > [root@quaco ~]# perf record -e instructions:k uname
> > > > > > Linux
> > > > > > [ perf record: Woken up 1 times to write data ]
> > > > > > [ perf record: Captured and wrote 0.024 MB perf.data (1 samples) ]
> > > > > > [root@quaco ~]# echo 1 > /proc/sys/kernel/kptr_restrict
> > > > > > [root@quaco ~]# perf record -e instructions:k uname
> > > > > > Linux
> > > > > > [ perf record: Woken up 1 times to write data ]
> > > > > > [ perf record: Captured and wrote 0.024 MB perf.data (1 samples) ]
> > > > > > [root@quaco ~]# echo 0 > /proc/sys/kernel/kptr_restrict
> > > > > > [root@quaco ~]# perf record -e instructions:k uname
> > > > > > Linux
> > > > > > [ perf record: Woken up 1 times to write data ]
> > > > > > [ perf record: Captured and wrote 0.024 MB perf.data (1 samples) ]
> > > > > > [root@quaco ~]#
> > > > > >
> > > > > > [acme@quaco perf]$ git log --oneline --author Lubashev tools/
> > > > > > 7ff5b5911144 (HEAD -> perf/cap, acme.korg/tmp.perf/cap, acme.korg/perf/cap) perf symbols: Use CAP_SYSLOG with kptr_restrict checks
> > > > > > d7604b66102e perf tools: Use CAP_SYS_ADMIN with perf_event_paranoid checks
> > > > > > c766f3df635d perf ftrace: Use CAP_SYS_ADMIN instead of euid==0
> > > > > > c22e150e3afa perf tools: Add helpers to use capabilities if present
> > > > > > 74d5f3d06f70 tools build: Add capability-related feature detection
> > > > > > perf version 5.3.rc4.g7ff5b5911144
> > > > > > [acme@quaco perf]$
> > > > >
> > > > > I got an ARM64 cloud VM, but I cannot reproduce.
> > > > > # cat /proc/sys/kernel/kptr_restrict
> > > > > 0
> > > > >
> > > > > Perf trace works fine (does not die):
> > > > > # ./perf trace -a
> > > > >
> > > > > Here is my setup:
> > > > > Repo: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git
> > > > > Branch: tmp.perf/cap
> > > > > Commit: 7ff5b5911 "perf symbols: Use CAP_SYSLOG with kptr_restrict
> > checks"
> > > > > gcc --version: gcc (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1) 7.4.0
> > > > > uname -a: Linux arm-4-par-1 4.9.93-mainline-rev1 #1 SMP Tue Apr 10
> > > > > 09:54:46 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux lsb_release
> > > > > -a: Ubuntu 18.04.3 LTS
> > > > >
> > > > > Auto-detecting system features:
> > > > > ... dwarf: [ on  ]
> > > > > ...dwarf_getlocations: [ on  ]
> > > > > ... glibc: [ on  ]
> > > > > ...  gtk2: [ on  ]
> > > > > ...  libaudit: [ on  ]
> > > > > ...libbfd: [ on  ]
> > > > > ...libcap: [ on  ]
> > > > > ...libelf: [ on  ]
> > > > > ...  

[PATCH v6] ata/pata_buddha: Probe via modalias instead of initcall

2019-08-20 Thread Max Staudt
Up until now, the pata_buddha driver would only check for cards on
initcall time. Now, the kernel will call its probe function as soon
as a compatible card is detected.

v6: Only do the drvdata workaround for X-Surf (remove breaks otherwise)
Style

v5: Remove module_exit(): There's no good way to handle the X-Surf hack.
Also include a workaround to save X-Surf's drvdata in case zorro8390
is active.

v4: Clean up pata_buddha_probe() by using ent->driver_data.
Support X-Surf via late_initcall()

v3: Clean up devm_*, implement device removal.

v2: Rename 'zdev' to 'z' to make the patch easy to analyse with
git diff --ignore-space-change

Signed-off-by: Max Staudt 
---
 drivers/ata/pata_buddha.c | 231 +++---
 1 file changed, 138 insertions(+), 93 deletions(-)

diff --git a/drivers/ata/pata_buddha.c b/drivers/ata/pata_buddha.c
index 11a8044ff..9e1b57866 100644
--- a/drivers/ata/pata_buddha.c
+++ b/drivers/ata/pata_buddha.c
@@ -18,7 +18,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -29,7 +31,7 @@
 #include 
 
 #define DRV_NAME "pata_buddha"
-#define DRV_VERSION "0.1.0"
+#define DRV_VERSION "0.1.1"
 
 #define BUDDHA_BASE1   0x800
 #define BUDDHA_BASE2   0xa00
@@ -47,11 +49,11 @@ enum {
BOARD_XSURF
 };
 
-static unsigned int buddha_bases[3] __initdata = {
+static unsigned int buddha_bases[3] = {
BUDDHA_BASE1, BUDDHA_BASE2, BUDDHA_BASE3
 };
 
-static unsigned int xsurf_bases[2] __initdata = {
+static unsigned int xsurf_bases[2] = {
XSURF_BASE1, XSURF_BASE2
 };
 
@@ -145,111 +147,154 @@ static struct ata_port_operations pata_xsurf_ops = {
.set_mode   = pata_buddha_set_mode,
 };
 
-static int __init pata_buddha_init_one(void)
+static int pata_buddha_probe(struct zorro_dev *z,
+const struct zorro_device_id *ent)
 {
-   struct zorro_dev *z = NULL;
+   static const char * const board_name[] = {
+   "Buddha", "Catweasel", "X-Surf"
+   };
+   struct ata_host *host;
+   void __iomem *buddha_board;
+   unsigned long board;
+   unsigned int type = ent->driver_data;
+   unsigned int nr_ports = (type == BOARD_CATWEASEL) ? 3 : 2;
+   void *old_drvdata;
+   int i;
+
+   dev_info(&z->dev, "%s IDE controller\n", board_name[type]);
+
+   board = z->resource.start;
+
+   if (type != BOARD_XSURF) {
+   if (!devm_request_mem_region(&z->dev,
+board + BUDDHA_BASE1,
+0x800, DRV_NAME))
+   return -ENXIO;
+   } else {
+   if (!devm_request_mem_region(&z->dev,
+board + XSURF_BASE1,
+0x1000, DRV_NAME))
+   return -ENXIO;
+   if (!devm_request_mem_region(&z->dev,
+board + XSURF_BASE2,
+0x1000, DRV_NAME)) {
+   }
+   }
+
+   /* Workaround for X-Surf: Save drvdata in case zorro8390 has set it */
+   if (type == BOARD_XSURF)
+   old_drvdata = dev_get_drvdata(&z->dev);
+
+   /* allocate host */
+   host = ata_host_alloc(&z->dev, nr_ports);
+   if (type == BOARD_XSURF)
+   dev_set_drvdata(&z->dev, old_drvdata);
+   if (!host)
+   return -ENXIO;
+
+   buddha_board = ZTWO_VADDR(board);
+
+   /* enable the board IRQ on Buddha/Catweasel */
+   if (type != BOARD_XSURF)
+   z_writeb(0, buddha_board + BUDDHA_IRQ_MR);
 
-   while ((z = zorro_find_device(ZORRO_WILDCARD, z))) {
-   static const char *board_name[]
-   = { "Buddha", "Catweasel", "X-Surf" };
-   struct ata_host *host;
-   void __iomem *buddha_board;
-   unsigned long board;
-   unsigned int type, nr_ports = 2;
-   int i;
-
-   if (z->id == ZORRO_PROD_INDIVIDUAL_COMPUTERS_BUDDHA) {
-   type = BOARD_BUDDHA;
-   } else if (z->id == ZORRO_PROD_INDIVIDUAL_COMPUTERS_CATWEASEL) {
-   type = BOARD_CATWEASEL;
-   nr_ports++;
-   } else if (z->id == ZORRO_PROD_INDIVIDUAL_COMPUTERS_X_SURF) {
-   type = BOARD_XSURF;
-   } else
-   continue;
-
-   dev_info(&z->dev, "%s IDE controller\n", board_name[type]);
-
-   board = z->resource.start;
+   for (i = 0; i < nr_ports; i++) {
+   struct ata_port *ap = host->ports[i];
+   void __iomem *base, *irqport;
+   unsigned long ctl = 0;
 
if (type != BOARD_XSURF) {
-   if (!devm_request_mem_region(&z->dev,
-board + 

Re: [PATCH 2/6] dt-bindings: net: sun8i-a83t-emac: Add phy-io-supply property

2019-08-20 Thread Rob Herring
On Tue, Aug 20, 2019 at 11:34 AM Ondřej Jirman  wrote:
>
> On Tue, Aug 20, 2019 at 11:20:22AM -0500, Rob Herring wrote:
> > On Tue, Aug 20, 2019 at 9:53 AM  wrote:
> > >
> > > From: Ondrej Jirman 
> > >
> > > Some PHYs require separate power supply for I/O pins in some modes
> > > of operation. Add phy-io-supply property, to allow enabling this
> > > power supply.
> >
> > Perhaps since this is new, such phys should have *-supply in their nodes.
>
> Yes, I just don't understand, since external ethernet phys are so common,
> and they require power, how there's no fairly generic mechanism for this
> already in the PHY subsystem, or somewhere?

Because generic mechanisms for this don't work. For example, what
happens when the 2 supplies need to be turned on in a certain order
and with certain timings? And then add in reset or control lines into
the mix... You can see in the bindings we already have some of that.

> It looks like other ethernet mac drivers also implement supplies on phys
> on the EMAC nodes. Just grep phy-supply through dt-bindings/net.
>
> Historical reasons, or am I missing something? It almost seems like I must
> be missing something, since putting these properties to phy nodes
> seems so obvious.

Things get added one by one and one new property isn't that
controversial. We've generally learned the lesson and avoid this
pattern now, but ethernet phys are one of the older bindings.

Rob


Re: [PATCH net-next v3 2/4] net: mdio: add PTP offset compensation to mdiobus_write_sts

2019-08-20 Thread Hubert Feurstein
On Tue, 20 Aug 2019 at 17:40, Miroslav Lichvar wrote:
>
> On Tue, Aug 20, 2019 at 05:23:06PM +0200, Andrew Lunn wrote:
> > > - take a second "post" system timestamp after the completion
> >
> > For this hardware, completion is an interrupt, which has a lot of
> > jitter on it. But this hardware is odd, in that it uses an
> > interrupt. Every other MDIO bus controller uses polled IO, with an
> > mdelay(10) or similar between each poll. So the jitter is going to be
> > much larger.
>
> I think a large jitter is ok in this case. We just need to timestamp
> something that we know for sure happened after the PHC timestamp. It
> should have no impact on the offset and its stability, just the
> reported delay. A test with phc2sys should be able to confirm that.
> phc2sys selects the measurement with the shortest delay, which has
> least uncertainty. I'd say that applies to both interrupt and polling.
>
> If it is difficult to specify the minimum interrupt delay, I'd still
> prefer an overly pessimistic interval assuming a zero delay.
>
Currently I do not see the benefit of this. The original intention was to
compensate for the remaining offset as well as possible. The current code
of phc2sys uses the delay only to select the measurement record with the
shortest delay, and for reporting and statistics. Why not simply shift
the timestamps by the offset to the point where we expect the PHC timestamp
to be captured? That gives a very good result compared to where we came
from.

Hubert
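The two positions in this thread can be illustrated with a simplified model of a PHC measurement record. The sketch below is not phc2sys code; `struct sample`, `sample_delay()`, `sample_offset()`, `best_sample()` and `compensate()` are hypothetical names, and the offset formula assumes the PHC timestamp was captured in the middle of the pre/post window. It shows that shifting both system timestamps by a fixed compensation changes the computed offset but leaves the delay, and therefore the shortest-delay selection, untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One measurement record: system time before and after the PHC read. */
struct sample {
	int64_t sys_pre_ns;   /* system timestamp taken before the access */
	int64_t phc_ns;       /* timestamp captured by the PHC */
	int64_t sys_post_ns;  /* system timestamp taken after completion */
};

/* Width of the window in which the PHC capture must have happened. */
static int64_t sample_delay(const struct sample *s)
{
	return s->sys_post_ns - s->sys_pre_ns;
}

/* PHC minus system time, assuming capture at the window midpoint. */
static int64_t sample_offset(const struct sample *s)
{
	return s->phc_ns - (s->sys_pre_ns + sample_delay(s) / 2);
}

/* Pick the record with the shortest delay, i.e. the least uncertainty.
 * Returns NULL when n is 0. */
static const struct sample *best_sample(const struct sample *v, size_t n)
{
	const struct sample *best = NULL;
	size_t i;

	assert(v != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (!best || sample_delay(&v[i]) < sample_delay(best))
			best = &v[i];
	return best;
}

/* Shift both system timestamps by a fixed compensation value. */
static struct sample compensate(struct sample s, int64_t shift_ns)
{
	s.sys_pre_ns += shift_ns;
	s.sys_post_ns += shift_ns;
	return s;
}
```

In this model a pessimistic "post" timestamp only widens the delay of a record, while a fixed shift moves the offset estimate directly.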


Re: [PATCH v2] x86/mm/pti: in pti_clone_pgtable() don't increase addr by PUD_SIZE

2019-08-20 Thread Rik van Riel
On Tue, 2019-08-20 at 10:00 -0400, Song Liu wrote:
> 
> From 9ae74cff4faf4710a11cb8da4c4a3f3404bd9fdd Mon Sep 17 00:00:00
> 2001
> From: Song Liu 
> Date: Mon, 19 Aug 2019 23:59:47 -0700
> Subject: [PATCH] x86/mm/pti: in pti_clone_pgtable(), increase addr
> properly
> 
> Before 32-bit support, pti_clone_pmds() always adds PMD_SIZE to addr.
> This behavior changes after the 32-bit support:  pti_clone_pgtable()
> increases addr by PUD_SIZE for pud_none(*pud) case, and increases
> addr by
> PMD_SIZE for pmd_none(*pmd) case. However, this is not accurate
> because
> addr may not be PUD_SIZE/PMD_SIZE aligned.
> 
> Fix this issue by properly rounding up addr to next PUD_SIZE/PMD_SIZE
> in these two cases.
> 
> Cc: sta...@vger.kernel.org # v4.19+
> Fixes: 16a3fe634f6a ("x86/mm/pti: Clone kernel-image on PTE level for
> 32 bit")
> Signed-off-by: Song Liu 
> Cc: Joerg Roedel 
> Cc: Thomas Gleixner 
> Cc: Dave Hansen 
> Cc: Andy Lutomirski 
> Cc: Peter Zijlstra 

This looks like it should do the trick!

Reviewed-by: Rik van Riel 

-- 
All Rights Reversed.
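The alignment problem described in the quoted commit message can be shown with plain integer arithmetic. This is a minimal sketch, assuming a 2 MiB PMD_SIZE (the real value depends on architecture and paging mode); `ROUND_UP()`, `next_naive()` and `next_aligned()` are hypothetical helpers written for this illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration: 2 MiB per PMD entry. */
#define PMD_SHIFT 21
#define PMD_SIZE  (UINT64_C(1) << PMD_SHIFT)

/* Round x up to a multiple of the power-of-two a (local helper). */
#define ROUND_UP(x, a) ((((x) - 1) | ((a) - 1)) + 1)

/* Old behaviour: blindly add PMD_SIZE, which skips past the next
 * PMD boundary whenever addr is not already aligned. */
static uint64_t next_naive(uint64_t addr)
{
	return addr + PMD_SIZE;
}

/* Fixed behaviour: advance to the start of the next PMD region. */
static uint64_t next_aligned(uint64_t addr)
{
	return ROUND_UP(addr + 1, PMD_SIZE);
}
```

For an aligned address both helpers agree. For an address 4 KiB into a PMD region, the naive step lands 4 KiB into the following region, so a range that ends shortly after the next boundary would never have that region visited.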




Re: [PATCH v5] ata/pata_buddha: Probe via modalias instead of initcall

2019-08-20 Thread Bartlomiej Zolnierkiewicz


On 8/20/19 6:42 PM, Max Staudt wrote:
> Hi Bartlomiej,
> 
> On 08/20/2019 02:06 PM, Bartlomiej Zolnierkiewicz wrote:
>> WARNING: line over 80 characters
>> #354: FILE: drivers/ata/pata_buddha.c:287:
>> +while ((z = 
>> zorro_find_device(ZORRO_PROD_INDIVIDUAL_COMPUTERS_X_SURF, z))) {
> 
> I see no good way to shorten this one. I think it's obvious enough to allow 
> overflowing by a few chars - do you agree?

Yes, I agree.

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics


Re: [PATCH v5] ata/pata_buddha: Probe via modalias instead of initcall

2019-08-20 Thread Bartlomiej Zolnierkiewicz


On 8/20/19 5:59 PM, Max Staudt wrote:
> Hi Bartlomiej,
> 
> Thank you very much for your review!
> 
> Question below.
> 
> 
> On 08/20/2019 02:06 PM, Bartlomiej Zolnierkiewicz wrote:
>>> +   /* Workaround for X-Surf: Save drvdata in case zorro8390 has set it */
>>> +   old_drvdata = dev_get_drvdata(>dev);
>>
>> This should be done only for type == BOARD_XSURF.
> 
> Agreed, as I want to keep unloading functional for Buddha/Catweasel - see 
> below.
> 
> 
>>> +static struct zorro_driver pata_buddha_driver = {
>>> +   .name   = "pata_buddha",
>>> +   .id_table   = pata_buddha_zorro_tbl,
>>> +   .probe  = pata_buddha_probe,
>>> +   .remove = pata_buddha_remove,
>>
>> I think that we should also add:
>>
>>  .driver  = {
>>  .suppress_bind_attrs = true,
>>  },
>>
>> to prevent the device from being unbound (and thus ->remove called)
>> from the driver using sysfs interface.
> 
> Interesting idea - here's my question now:
> 
> My intention is to allow remove() for boards where we support IDE only 
> (Buddha, Catweasel) - these are autoprobed via zorro_register_driver().
> This shouldn't affect the X-Surf case, as it's not autoprobed in this way 
> anyway - and thus pata_buddha_driver isn't even used.
> 
> Am I missing something? We want to inhibit module unloading (hence no 
> module_exit()), but driver unbinding for Buddha/Catweasel should be fine to 
> remain, right?

Indeed, pata_buddha_driver is not even used for X-Surf so this is not
an issue (please disregard my comment about suppress_bind_attrs).

>> Please also always check your patches with scripts/checkpatch.pl and
>> fix the reported issues:
> 
> Apologies, must've been something in my coffee. I will.
> 
> 
> Thanks for the review, I'll send a new patch once my question above is 
> resolved.
> 
> Max
Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics


Re: [PATCH 00/14] per memcg lru_lock

2019-08-20 Thread Shakeel Butt
On Tue, Aug 20, 2019 at 3:45 AM Michal Hocko  wrote:
>
> On Tue 20-08-19 17:48:23, Alex Shi wrote:
> > This patchset moves lru_lock into lruvec, giving each lruvec its own
> > lru_lock and thus giving each memcg its own lru_lock.
> >
> > Per-memcg lru_lock eases lru_lock contention a lot in
> > this patch series.
> >
> > In some data centers, containers are widely used to deploy different kinds
> > of services, so multiple memcgs share the per-node pgdat->lru_lock, which
> > causes heavy lock contention during lru operations.
>
> Having some real world workloads numbers would be more than useful
> for a non trivial change like this. I believe googlers have tried
> something like this in the past but then didn't have really a good
> example of workloads that benefit. I might misremember though. Cc Hugh.
>

We, at Google, have been using per-memcg lru locks for more than 7
years. Per-memcg lru locks are really beneficial for providing
performance isolation if there are multiple distinct jobs/memcgs
running on large machines. We are planning to upstream our internal
implementation. I will let Hugh comment on that.

thanks,
Shakeel


[PATCH v2 7/7] bug: Move WARN_ON() "cut here" into exception handler

2019-08-20 Thread Kees Cook
The original clean up of "cut here" missed the WARN_ON() case (that
does not have a printk message), which was fixed recently by adding
an explicit printk of "cut here". This had the downside of adding a
printk() to every WARN_ON() caller, which reduces the utility of using
an instruction exception to streamline the resulting code. By making
this a new BUGFLAG, all of these can be removed and "cut here" can be
handled by the exception handler.

This was very pronounced on PowerPC, but the effect can be seen on
x86 as well. The resulting text size of a defconfig build shows some
small savings from this patch:

    text     data      bss       dec      hex  filename
19691167  5134320  1646664  26472151  193eed7  vmlinux.before
19676362  5134260  1663048  26473670  193f4c6  vmlinux.after

This change also opens the door for creating something like BUG_MSG(),
which would issue a custom printk() before BUG() without confusing the
"cut here" line.

Reported-by: Christophe Leroy 
Fixes: 6b15f678fb7d ("include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures")
Signed-off-by: Kees Cook 
---
v2:
 - rename BUGFLAG_PRINTK to BUGFLAG_NO_CUT_HERE (peterz, christophe)
---
 include/asm-generic/bug.h |  8 +++-
 lib/bug.c | 11 +--
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 588dd59a5b72..a21e83f8a274 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -10,6 +10,7 @@
 #define BUGFLAG_WARNING        (1 << 0)
 #define BUGFLAG_ONCE           (1 << 1)
 #define BUGFLAG_DONE           (1 << 2)
+#define BUGFLAG_NO_CUT_HERE    (1 << 3)        /* CUT_HERE already sent */
 #define BUGFLAG_TAINT(taint)   ((taint) << 8)
 #define BUG_GET_TAINT(bug) ((bug)->flags >> 8)
 #endif
@@ -86,13 +87,10 @@ void warn_slowpath_fmt(const char *file, const int line, 
unsigned taint,
warn_slowpath_fmt(__FILE__, __LINE__, taint, arg)
 #else
 extern __printf(1, 2) void __warn_printk(const char *fmt, ...);
-#define __WARN() do {  \
-   printk(KERN_WARNING CUT_HERE);  \
-   __WARN_FLAGS(BUGFLAG_TAINT(TAINT_WARN));\
-   } while (0)
+#define __WARN()   __WARN_FLAGS(BUGFLAG_TAINT(TAINT_WARN))
 #define __WARN_printf(taint, arg...) do {  \
__warn_printk(arg); \
-   __WARN_FLAGS(BUGFLAG_TAINT(taint)); \
+   __WARN_FLAGS(BUGFLAG_NO_CUT_HERE | BUGFLAG_TAINT(taint));\
} while (0)
 #define WARN_ON_ONCE(condition) ({ \
int __ret_warn_on = !!(condition);  \
diff --git a/lib/bug.c b/lib/bug.c
index 1077366f496b..8c98af0bf585 100644
--- a/lib/bug.c
+++ b/lib/bug.c
@@ -181,6 +181,15 @@ enum bug_trap_type report_bug(unsigned long bugaddr, 
struct pt_regs *regs)
}
}
 
+   /*
+* BUG() and WARN_ON() families don't print a custom debug message
+* before triggering the exception handler, so we must add the
+* "cut here" line now. WARN() issues its own "cut here" before the
+* extra debugging message it writes before triggering the handler.
+*/
+   if ((bug->flags & BUGFLAG_NO_CUT_HERE) == 0)
+   printk(KERN_DEFAULT CUT_HERE);
+
if (warning) {
/* this is a WARN_ON rather than BUG/BUG_ON */
__warn(file, line, (void *)bugaddr, BUG_GET_TAINT(bug), regs,
@@ -188,8 +197,6 @@ enum bug_trap_type report_bug(unsigned long bugaddr, struct 
pt_regs *regs)
return BUG_TRAP_TYPE_WARN;
}
 
-   printk(KERN_DEFAULT CUT_HERE);
-
if (file)
pr_crit("kernel BUG at %s:%u!\n", file, line);
else
-- 
2.17.1


-- 
Kees Cook
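The flag logic this patch moves into the exception handler can be modelled in a few lines of standalone C. This is a sketch only: the flag values mirror the quoted hunk, while `TAINT_WARN`'s value, the `handler_prints_cut_here()` helper and the three `*_flags()` constructors are assumptions made for illustration (in particular, it assumes the architecture's __WARN_FLAGS() ORs in BUGFLAG_WARNING).

```c
#include <assert.h>
#include <stdbool.h>

#define BUGFLAG_WARNING      (1 << 0)
#define BUGFLAG_ONCE         (1 << 1)
#define BUGFLAG_DONE         (1 << 2)
#define BUGFLAG_NO_CUT_HERE  (1 << 3)  /* CUT_HERE already sent */
#define BUGFLAG_TAINT(taint) ((taint) << 8)
#define BUG_GET_TAINT(flags) ((flags) >> 8)

#define TAINT_WARN 9  /* illustrative value for this sketch */

/* Hypothetical model of the check added to report_bug(): the handler
 * emits "cut here" unless the caller says it was already printed. */
static bool handler_prints_cut_here(unsigned int flags)
{
	return (flags & BUGFLAG_NO_CUT_HERE) == 0;
}

/* Flags as a plain WARN_ON() would pass them after this patch. */
static unsigned int warn_on_flags(void)
{
	return BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN);
}

/* Flags as WARN() with a message would pass them: the message path
 * has already printed "cut here", so the handler must not repeat it. */
static unsigned int warn_printf_flags(void)
{
	return BUGFLAG_WARNING | BUGFLAG_NO_CUT_HERE |
	       BUGFLAG_TAINT(TAINT_WARN);
}

/* BUG() carries no flags, so the handler prints "cut here". */
static unsigned int bug_flags(void)
{
	return 0;
}
```

The point of the design is that WARN_ON() call sites no longer need their own printk(); only the message-carrying WARN() path sets the new flag.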


Re: [PATCH 4.14 04/33] tcp: be more careful in tcp_fragment()

2019-08-20 Thread Matthieu Baerts
Hi Eric,

On 08/08/2019 21:05, Greg Kroah-Hartman wrote:
> commit b617158dc096709d8600c53b6052144d12b89fab upstream.
> 
> Some applications set tiny SO_SNDBUF values and expect
> TCP to just work. Recent patches to address CVE-2019-11478
> broke them in case of losses, since retransmits might
> be prevented.
> 
> We should allow these flows to make progress.
> 
> This patch allows the first and last skb in retransmit queue
> to be split even if memory limits are hit.
> 
> It also adds some room due to the fact that tcp_sendmsg()
> and tcp_sendpage() might overshoot sk_wmem_queued by about one full
> TSO skb (64KB size). Note this allowance was already present
> in stable backports for kernels < 4.15
> 
> Note for < 4.15 backports :
>  tcp_rtx_queue_tail() will probably look like :
> 
> static inline struct sk_buff *tcp_rtx_queue_tail(const struct sock *sk)
> {
>   struct sk_buff *skb = tcp_send_head(sk);
> 
>   return skb ? tcp_write_queue_prev(sk, skb) : tcp_write_queue_tail(sk);
> }
> 
> Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
> Signed-off-by: Eric Dumazet 
> Reported-by: Andrew Prout 
> Tested-by: Andrew Prout 
> Tested-by: Jonathan Lemon 
> Tested-by: Michal Kubecek 
> Acked-by: Neal Cardwell 
> Acked-by: Yuchung Cheng 
> Acked-by: Christoph Paasch 
> Cc: Jonathan Looney 
> Signed-off-by: David S. Miller 
> Signed-off-by: Matthieu Baerts 
> Signed-off-by: Sasha Levin 
> ---
>  include/net/tcp.h | 17 +
>  net/ipv4/tcp_output.c | 11 ++-
>  2 files changed, 27 insertions(+), 1 deletion(-)

I am sorry to bother you again with the recent modifications in
tcp_fragment() but it seems we have a new kernel BUG with this patch in
v4.14.

Here is the call trace.


[26665.934461] ------------[ cut here ]------------
[26665.936152] kernel BUG at ./include/linux/skbuff.h:1406!
[26665.937941] invalid opcode:  [#1] SMP PTI
[26665.977252] Call Trace:
[26665.978267]  
[26665.979163]  tcp_fragment+0x9c/0x2cf
[26665.980562]  tcp_write_xmit+0x68f/0x988
[26665.982031]  __tcp_push_pending_frames+0x3b/0xa0
[26665.983684]  tcp_data_snd_check+0x2a/0xc8
[26665.985196]  tcp_rcv_established+0x2a8/0x30d
[26665.986736]  tcp_v4_do_rcv+0xb2/0x158
[26665.988140]  tcp_v4_rcv+0x692/0x956
[26665.989533]  ip_local_deliver_finish+0xeb/0x169
[26665.991250]  __netif_receive_skb_core+0x51c/0x582
[26665.993028]  ? inet_gro_receive+0x239/0x247
[26665.994581]  netif_receive_skb_internal+0xab/0xc6
[26665.996340]  napi_gro_receive+0x8a/0xc0
[26665.997790]  receive_buf+0x9a1/0x9cd
[26665.999232]  ? load_balance+0x17a/0x7b7
[26666.000711]  ? vring_unmap_one+0x18/0x61
[26666.002196]  ? detach_buf+0x60/0xfa
[26666.003526]  virtnet_poll+0x128/0x1e1
[26666.004860]  net_rx_action+0x12a/0x2b1
[26666.006309]  __do_softirq+0x11c/0x26b
[26666.007734]  ? handle_irq_event+0x44/0x56
[26666.009275]  irq_exit+0x61/0xa0
[26666.010511]  do_IRQ+0x9d/0xbb
[26666.011685]  common_interrupt+0x85/0x85


We are doing the tests with the MPTCP stack[1], the error might come
from there but the call trace is free of MPTCP functions. We are still
working on having a reproducible setup with MPTCP before doing the same
without MPTCP but please see below the analysis we did so far with some
questions.

[1] https://github.com/multipath-tcp/mptcp/tree/mptcp_v0.94

> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 0b477a1e11770..7994e569644e0 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -1688,6 +1688,23 @@ static inline void tcp_check_send_head(struct sock 
> *sk, struct sk_buff *skb_unli
>   tcp_sk(sk)->highest_sack = NULL;
>  }
>  
> +static inline struct sk_buff *tcp_rtx_queue_head(const struct sock *sk)
> +{
> + struct sk_buff *skb = tcp_write_queue_head(sk);
> +
> + if (skb == tcp_send_head(sk))
> + skb = NULL;
> +
> + return skb;
> +}
> +
> +static inline struct sk_buff *tcp_rtx_queue_tail(const struct sock *sk)
> +{
> + struct sk_buff *skb = tcp_send_head(sk);
> +
> + return skb ? tcp_write_queue_prev(sk, skb) : tcp_write_queue_tail(sk);
> +}
> +
>  static inline void __tcp_add_write_queue_tail(struct sock *sk, struct 
> sk_buff *skb)
>  {
>   __skb_queue_tail(>sk_write_queue, skb);
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index a5960b9b6741c..a99086bf26eaf 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1264,6 +1264,7 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, 
> u32 len,
>   struct tcp_sock *tp = tcp_sk(sk);
>   struct sk_buff *buff;
>   int nsize, old_factor;
> + long limit;
>   int nlen;
>   u8 flags;
>  
> @@ -1274,7 +1275,15 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, 
> u32 len,
>   if (nsize < 0)
>   nsize = 0;
>  
> - if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf + 0x2)) {
> + /* tcp_sendmsg() can overshoot sk_wmem_queued by one full size skb.
> +  * We need some allowance 

Re: [PATCH v5] ata/pata_buddha: Probe via modalias instead of initcall

2019-08-20 Thread Max Staudt
Hi Bartlomiej,

On 08/20/2019 02:06 PM, Bartlomiej Zolnierkiewicz wrote:
> WARNING: line over 80 characters
> #354: FILE: drivers/ata/pata_buddha.c:287:
> +while ((z = 
> zorro_find_device(ZORRO_PROD_INDIVIDUAL_COMPUTERS_X_SURF, z))) {

I see no good way to shorten this one. I think it's obvious enough to allow 
overflowing by a few chars - do you agree?


Max


Re: [PATCH RFC] dt-bindings: regulator: define a mux regulator

2019-08-20 Thread Rob Herring
On Tue, Aug 20, 2019 at 10:25 AM Uwe Kleine-König
 wrote:
>
> A mux regulator is used to provide current on one of several outputs. It
> might look as follows:
>
>   [ASCII diagram of the mux regulator omitted: input IN, address inputs
>   A0, A1 and A2, eight outputs]
>
> Depending on which address is encoded on the three address inputs A0, A1
> and A2 the current provided on IN is provided on one of the eight
> outputs.
>
> What is new here is that the binding makes use of a #regulator-cells
> property. This uses the approach known from other bindings (e.g. gpio)
> to allow referencing all eight outputs with phandle arguments. This
> requires an extension in of_get_regulator to use a new variant of
> of_parse_phandle_with_args that has a cell_count_default parameter that
> is used in absence of a $cell_name property. Even if we'd choose to
> update all regulator-bindings to add #regulator-cells = <0>; we still
> needed something to implement compatibility to the currently defined
> bindings.
>
> Signed-off-by: Uwe Kleine-König 
> ---
> Hello,
>
> the obvious alternative is to add (here) eight subnodes to represent the
> eight outputs. This is IMHO less pretty, but wouldn't need to introduce
> #regulator-cells.

I'm okay with #regulator-cells approach.

>
> Apart from reg = <..> and a phandle there is (I think) nothing that
> needs to be specified in the subnodes because all properties of an
> output (apart from the address) apply to all outputs.
>
> What do you think?
>
> Best regards
> Uwe
>
>  .../bindings/regulator/mux-regulator.yaml | 52 +++
>  1 file changed, 52 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/regulator/mux-regulator.yaml
>
> diff --git a/Documentation/devicetree/bindings/regulator/mux-regulator.yaml 
> b/Documentation/devicetree/bindings/regulator/mux-regulator.yaml
> new file mode 100644
> index ..f06dbb969090
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/regulator/mux-regulator.yaml
> @@ -0,0 +1,52 @@
> +# SPDX-License-Identifier: GPL-2.0

(GPL-2.0-only OR BSD-2-Clause) is preferred.


> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/regulator/mux-regulator.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: MUX regulators
> +
> +properties:
> +  compatible:
> +const: XXX,adb708

? I assume you will split this into a common and specific schemas. I
suppose there could be differing ways to control the mux just like all
other muxes.

> +
> +  enable-gpios:
> +maxItems: 1
> +
> +  address-gpios:
> +description: Array of typically three GPIO pins used to select the
> +  regulator's output. The least significant address GPIO must be listed
> +  first. The others follow in order of significance.
> +minItems: 1
> +
> +  "#regulator-cells":

How is this not required?

> +const: 1
> +
> +  regulator-name:
> +description: A string used to construct the sub regulator's names
> +$ref: "/schemas/types.yaml#/definitions/string"
> +
> +  supply:
> +description: input supply
> +
> +required:
> +  - compatible
> +  - regulator-name
> +  - supply
> +
> +
> +examples:
> +  - |
> +mux-regulator {
> +  compatible = "regulator-mux";
> +
> +  regulator-name = "blafasel";
> +
> +  supply = <_regulator>;
> +
> +  enable-gpios = < 5 GPIO_ACTIVE_HIGH>;
> +  address-gpios = < 2 GPIO_ACTIVE_HIGH>,
> +< 3 GPIO_ACTIVE_HIGH>,
> +< 4 GPIO_ACTIVE_HIGH>,
> +};
> +...
> --
> 2.20.1
>


Re: [PATCH V40 19/29] lockdown: Lock down module params that specify hardware parameters (eg. ioport)

2019-08-20 Thread Jessica Yu

+++ Matthew Garrett [19/08/19 17:17 -0700]:

From: David Howells 

Provided an annotation for module parameters that specify hardware
parameters (such as io ports, iomem addresses, irqs, dma channels, fixed
dma buffers and other types).

Suggested-by: Alan Cox 
Signed-off-by: David Howells 
Signed-off-by: Matthew Garrett 
Reviewed-by: Kees Cook 
Cc: Jessica Yu 
Signed-off-by: James Morris 


Acked-by: Jessica Yu 

Thanks!


---
include/linux/security.h |  1 +
kernel/params.c  | 21 -
security/lockdown/lockdown.c |  1 +
3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index b4a85badb03a..1a3404f9c060 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -113,6 +113,7 @@ enum lockdown_reason {
LOCKDOWN_ACPI_TABLES,
LOCKDOWN_PCMCIA_CIS,
LOCKDOWN_TIOCSSERIAL,
+   LOCKDOWN_MODULE_PARAMETERS,
LOCKDOWN_INTEGRITY_MAX,
LOCKDOWN_CONFIDENTIALITY_MAX,
};
diff --git a/kernel/params.c b/kernel/params.c
index cf448785d058..8e56f8b12d8f 100644
--- a/kernel/params.c
+++ b/kernel/params.c
@@ -12,6 +12,7 @@
#include 
#include 
#include 
+#include 

#ifdef CONFIG_SYSFS
/* Protects all built-in parameters, modules use their own param_lock */
@@ -96,13 +97,19 @@ bool parameq(const char *a, const char *b)
return parameqn(a, b, strlen(a)+1);
}

-static void param_check_unsafe(const struct kernel_param *kp)
+static bool param_check_unsafe(const struct kernel_param *kp)
{
+   if (kp->flags & KERNEL_PARAM_FL_HWPARAM &&
+   security_locked_down(LOCKDOWN_MODULE_PARAMETERS))
+   return false;
+
if (kp->flags & KERNEL_PARAM_FL_UNSAFE) {
pr_notice("Setting dangerous option %s - tainting kernel\n",
  kp->name);
add_taint(TAINT_USER, LOCKDEP_STILL_OK);
}
+
+   return true;
}

static int parse_one(char *param,
@@ -132,8 +139,10 @@ static int parse_one(char *param,
pr_debug("handling %s with %p\n", param,
params[i].ops->set);
kernel_param_lock(params[i].mod);
-   param_check_unsafe(&params[i]);
-   err = params[i].ops->set(val, &params[i]);
+   if (param_check_unsafe(&params[i]))
+   err = params[i].ops->set(val, &params[i]);
+   else
+   err = -EPERM;
kernel_param_unlock(params[i].mod);
return err;
}
@@ -553,8 +562,10 @@ static ssize_t param_attr_store(struct module_attribute 
*mattr,
return -EPERM;

kernel_param_lock(mk->mod);
-   param_check_unsafe(attribute->param);
-   err = attribute->param->ops->set(buf, attribute->param);
+   if (param_check_unsafe(attribute->param))
+   err = attribute->param->ops->set(buf, attribute->param);
+   else
+   err = -EPERM;
kernel_param_unlock(mk->mod);
if (!err)
return len;
diff --git a/security/lockdown/lockdown.c b/security/lockdown/lockdown.c
index 771c77f9c04a..0fa434294667 100644
--- a/security/lockdown/lockdown.c
+++ b/security/lockdown/lockdown.c
@@ -28,6 +28,7 @@ static char *lockdown_reasons[LOCKDOWN_CONFIDENTIALITY_MAX+1] 
= {
[LOCKDOWN_ACPI_TABLES] = "modifying ACPI tables",
[LOCKDOWN_PCMCIA_CIS] = "direct PCMCIA CIS storage",
[LOCKDOWN_TIOCSSERIAL] = "reconfiguration of serial port IO",
+   [LOCKDOWN_MODULE_PARAMETERS] = "unsafe module parameters",
[LOCKDOWN_INTEGRITY_MAX] = "integrity",
[LOCKDOWN_CONFIDENTIALITY_MAX] = "confidentiality",
};
--
2.23.0.rc1.153.gdeed80330f-goog



Re: [PATCH v2 1/8] arm64: dts: qcom: sm8150: add base dts file

2019-08-20 Thread Vinod Koul
On 20-08-19, 19:03, Amit Kucheria wrote:
> On Tue, Aug 20, 2019 at 12:14 PM Vinod Koul  wrote:
> >
> > This adds a base DTS file with cpu, psci, firmware, clock, tlmm and
> > spmi nodes, which enables booting to console
> >
> > Signed-off-by: Vinod Koul 
> > ---
> >  arch/arm64/boot/dts/qcom/sm8150.dtsi | 305 +++
> >  1 file changed, 305 insertions(+)
> >  create mode 100644 arch/arm64/boot/dts/qcom/sm8150.dtsi
> >
> > diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi 
> > b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > new file mode 100644
> > index ..d9dc95f851b7
> > --- /dev/null
> > +++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > @@ -0,0 +1,305 @@
> > +// SPDX-License-Identifier: BSD-3-Clause
> 
> This is fine.
> 
> > +// Copyright (c) 2017-2019, The Linux Foundation. All rights reserved.
> > +// Copyright (c) 2019, Linaro Limited
> 
> These two lines should be in /* */

Yeah, I made it the same as the previous one; let's use the right style.

> > +   timer {
> > +   compatible = "arm,armv8-timer";
> > +   interrupts = ,
> > +,
> > +,
> > +;
> 
> Any particular reason why these are defined in this order - 1, 2, 3, 0?

Copied from downstream :)

-- 
~Vinod


Re: [PATCH] x86/mm/pti: in pti_clone_pgtable() don't increase addr by PUD_SIZE

2019-08-20 Thread Song Liu



> On Aug 20, 2019, at 9:05 AM, Song Liu  wrote:
> 
> 
> 
>> On Aug 20, 2019, at 7:18 AM, Dave Hansen  wrote:
>> 
>> On 8/20/19 7:14 AM, Song Liu wrote:
>>>> *But*, that shouldn't get hit on a Skylake CPU since those have PCIDs
>>>> and shouldn't have a global kernel image.  Could you confirm whether
>>>> PCIDs are supported on this CPU?
>>> Yes, pcid is listed in /proc/cpuinfo. 
>> 
>> So what's going on?  Could you confirm exactly which pti_clone_pgtable()
>> is causing you problems?  Do you have a theory as to why this manifests
>> as a performance problem rather than a functional one?
>> 
>> A diff of these:
>> 
>>  /sys/kernel/debug/page_tables/current_user
>>  /sys/kernel/debug/page_tables/current_kernel
>> 
>> before and after your patch might be helpful.
> 
> I believe the difference is from the following entries (7 PMDs)
> 
> Before the patch:
> 
> current_kernel:   0x8100-0x81e04000   14352K ro GLB x  pte
> efi:              0x8100-0x81e04000   14352K ro GLB x  pte
> kernel:           0x8100-0x81e04000   14352K ro GLB x  pte
> 
> 
> After the patch:
> 
> current_kernel:   0x8100-0x81e0  14M ro PSE GLB x  pmd
> efi:              0x8100-0x81e0  14M ro PSE GLB x  pmd
> kernel:           0x8100-0x81e0  14M ro PSE GLB x  pmd
> 
> current_kernel and kernel show same data though. 

A few more details on how I got here.

We use huge pages for hot text to reduce iTLB misses. When we
benchmarked a 5.2-based kernel (vs. a 4.16-based one), we found ~2.5x more
iTLB misses.

To figure out the issue, I used a debug patch that dumps the page table for
a pid. The following is the information from the workload pid.


For the 4.16 based kernel:

host-4.16 # grep "x  pmd" /sys/kernel/debug/page_tables/dump_pid
0x0060-0x00e0   8M USR ro PSE x  pmd
0x81a0-0x81c0   2M ro PSE x  pmd


For the 5.2 based kernel before this patch:

host-5.2-before # grep "x  pmd" /sys/kernel/debug/page_tables/dump_pid
0x0060-0x00e0   8M USR ro PSE x  pmd


The 8MB text in pmd is from user space. The 4.16 kernel has 1 pmd for the
irq entry table, while the 5.2 kernel doesn't have it.


For the 5.2 based kernel after this patch:

host-5.2-after # grep "x  pmd" /sys/kernel/debug/page_tables/dump_pid
0x0060-0x00e0   8M USR ro PSE x  pmd
0x8100-0x81e0  14M ro PSE GLB x  pmd


So after this patch, the 5.2 based kernel has 7 PMDs instead of the 1 PMD
in the 4.16 kernel. This further reduces the iTLB miss rate.

Thanks,
Song

[PATCH] mm/balloon_compaction: suppress allocation warnings

2019-08-20 Thread Nadav Amit
There is no reason to print warnings when balloon page allocation fails,
as they are expected and can be handled gracefully.  Since VMware
balloon now uses balloon-compaction infrastructure, and suppressed these
warnings before, it is also beneficial to suppress these warnings to
keep the same behavior that the balloon had before.

Cc: Jason Wang 
Signed-off-by: Nadav Amit 
---
 mm/balloon_compaction.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index 798275a51887..26de020aae7b 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -124,7 +124,8 @@ EXPORT_SYMBOL_GPL(balloon_page_list_dequeue);
 struct page *balloon_page_alloc(void)
 {
struct page *page = alloc_page(balloon_mapping_gfp_mask() |
-  __GFP_NOMEMALLOC | __GFP_NORETRY);
+  __GFP_NOMEMALLOC | __GFP_NORETRY |
+  __GFP_NOWARN);
return page;
 }
 EXPORT_SYMBOL_GPL(balloon_page_alloc);
-- 
2.19.1



Re: [PATCH 2/6] dt-bindings: net: sun8i-a83t-emac: Add phy-io-supply property

2019-08-20 Thread Ondřej Jirman
On Tue, Aug 20, 2019 at 11:20:22AM -0500, Rob Herring wrote:
> On Tue, Aug 20, 2019 at 9:53 AM  wrote:
> >
> > From: Ondrej Jirman 
> >
> > Some PHYs require separate power supply for I/O pins in some modes
> > of operation. Add phy-io-supply property, to allow enabling this
> > power supply.
> 
> Perhaps since this is new, such phys should have *-supply in their nodes.

Yes, I just don't understand, since external ethernet phys are so common,
and they require power, how there's no fairly generic mechanism for this
already in the PHY subsystem, or somewhere?

It looks like other ethernet mac drivers also implement supplies on phys
on the EMAC nodes. Just grep phy-supply through dt-bindings/net.

Historical reasons, or am I missing something? It almost seems like I must
be missing something, since putting these properties to phy nodes
seems so obvious.

thank you and regards,
Ondrej

> >
> > Signed-off-by: Ondrej Jirman 
> > ---
> >  .../devicetree/bindings/net/allwinner,sun8i-a83t-emac.yaml| 4 
> >  1 file changed, 4 insertions(+)


Re: [PATCH 7/7] bug: Move WARN_ON() "cut here" into exception handler

2019-08-20 Thread Kees Cook
On Tue, Aug 20, 2019 at 12:58:49PM +0200, Christophe Leroy wrote:
> Le 20/08/2019 à 12:06, Peter Zijlstra a écrit :
> > On Mon, Aug 19, 2019 at 04:41:11PM -0700, Kees Cook wrote:
> > 
> > > diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
> > > index 588dd59a5b72..da471fcc5487 100644
> > > --- a/include/asm-generic/bug.h
> > > +++ b/include/asm-generic/bug.h
> > > @@ -10,6 +10,7 @@
> > >   #define BUGFLAG_WARNING (1 << 0)
> > >   #define BUGFLAG_ONCE(1 << 1)
> > >   #define BUGFLAG_DONE(1 << 2)
> > > +#define BUGFLAG_PRINTK   (1 << 3)
> > >   #define BUGFLAG_TAINT(taint)((taint) << 8)
> > >   #define BUG_GET_TAINT(bug)  ((bug)->flags >> 8)
> > >   #endif
> > 
> > > diff --git a/lib/bug.c b/lib/bug.c
> > > index 1077366f496b..6c22e8a6f9de 100644
> > > --- a/lib/bug.c
> > > +++ b/lib/bug.c
> > > @@ -181,6 +181,15 @@ enum bug_trap_type report_bug(unsigned long bugaddr, 
> > > struct pt_regs *regs)
> > >   }
> > >   }
> > > + /*
> > > +  * BUG() and WARN_ON() families don't print a custom debug message
> > > +  * before triggering the exception handler, so we must add the
> > > +  * "cut here" line now. WARN() issues its own "cut here" before the
> > > +  * extra debugging message it writes before triggering the handler.
> > > +  */
> > > + if ((bug->flags & BUGFLAG_PRINTK) == 0)
> > > + printk(KERN_DEFAULT CUT_HERE);
> > 
> > I'm not loving that BUGFLAG_PRINTK name, BUGFLAG_CUT_HERE makes more
> > sense to me.

That's fine -- easy rename. :)

> Actually it would be BUGFLAG_NO_CUT_HERE then, otherwise all arches not
> using the generic macros will have to add the flag to get the "cut here"
> line.

I am testing for the lack of the flag (so that only the
CONFIG_GENERIC_BUG with __WARN_FLAGS case needs to set it). I was
thinking of the flag to mean "this reporting flow has already issued
cut-here". It sounds like it would be more logical to have it named
BUGFLAG_NO_CUT_HERE to mean "do not issue a cut-here; it has already
happened"? I will update the patch.
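
As a rough sketch of the polarity described above (not the actual patch:
handler_prints_cut_here() is a made-up helper, and the flag values are
copied from the quoted hunk with bit 3 renamed as proposed), the handler
would emit the line only when the flag is absent:

```c
#include <stdbool.h>

/* Flag values as in the quoted hunk; bit 3 carries the proposed new name. */
#define BUGFLAG_WARNING     (1 << 0)
#define BUGFLAG_ONCE        (1 << 1)
#define BUGFLAG_DONE        (1 << 2)
#define BUGFLAG_NO_CUT_HERE (1 << 3)

/*
 * Hypothetical helper: returns true when the exception handler still has
 * to print the "cut here" line, i.e. when the reporting flow has not
 * already issued it and therefore did not set BUGFLAG_NO_CUT_HERE.
 */
static inline bool handler_prints_cut_here(unsigned int flags)
{
	return (flags & BUGFLAG_NO_CUT_HERE) == 0;
}
```

With this polarity, arches that never set the flag keep getting the
"cut here" line from the handler without any change on their side.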

Thanks!

-- 
Kees Cook


Re: [PATCH v2 0/2] Simplify mtty driver and mdev core

2019-08-20 Thread Cornelia Huck
On Tue, 20 Aug 2019 11:25:05 +
Parav Pandit  wrote:

> > -Original Message-
> > From: Christophe de Dinechin 
> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > 
> > 
> > Parav Pandit writes:
> >   
> > > + Dave.
> > >
> > > Hi Jiri, Dave, Alex, Kirti, Cornelia,
> > >
> > > Please provide your feedback on it, how shall we proceed?
> > >
> > > Hence, I would like to discuss below options.
> > >
> > > Option-1: mdev index
> > > Introduce an optional mdev index/handle as u32 during mdev create time.
> > > User passes mdev index/handle as input.
> > >
> > > phys_port_name=mIndex=m%u
> > > mdev_index will be available in sysfs as mdev attribute for udev to name
> > > the mdev's netdev.
> > >
> > > example mdev create command:
> > > UUID=$(uuidgen)
> > > echo $UUID index=10 >
> > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > example netdevs:
> > > repnetdev=ens2f0_m10  /*ens2f0 is parent PF's netdevice */
> > > mdev_netdev=enm10
> > >
> > > Pros:
> > > 1. mdevctl and any other existing tools are unaffected.
> > > 2. netdev stack, ovs and other switching platforms are unaffected.
> > > 3. achieves unique phys_port_name for representor netdev
> > > 4. achieves unique mdev eth netdev name for the mdev using udev/systemd
> > > extension.
> > > 5. Aligns well with mdev and netdev subsystem and similar to existing
> > > sriov bdf's.
> > >
> > > Option-2: shorter mdev name
> > > Extend mdev to have shorter mdev device name in addition to UUID.
> > > such as 'foo', 'bar'.
> > > Mdev will continue to have UUID.

I fail to understand how 'uses uuid' and 'allow shorter device name'
are supposed to play together?

> > > phys_port_name=mdev_name
> > >
> > > Pros:
> > > 1. All same as option-1, except mdevctl needs upgrade for newer usage.
> > > It is common practice to upgrade iproute2 package along with the kernel.
> > > Similar practice to be done with mdevctl.
> > > 2. Newer users of mdevctl who want to work with non_UUID names will use
> > > newer mdevctl/tools.
> > > Cons:
> > > 1. Dual naming scheme of mdev might affect some of the existing tools.
> > > It's unclear how/if it actually affects.
> > > mdevctl [2] is very recently developed and can be enhanced for dual
> > > naming scheme.

The main problem is not tools we know about (i.e. mdevctl), but those we
don't know about.

IOW, this (and the IFNAMESIZ change, which seems even worse) are the
options I would not want at all.

> > >
> > > Option-3: mdev uuid alias
> > > Instead of shorter mdev name or mdev index, have alpha-numeric name
> > > alias.
> > > Alias is an optional mdev sysfs attribute such as 'foo', 'bar'.
> > > example mdev create command:
> > > UUID=$(uuidgen)
> > > echo $UUID alias=foo >
> > > /sys/class/net/ens2f0/mdev_supported_types/mlx5_core_mdev/create
> > > example netdevs:
> > > repnetdev = ens2f0_mfoo
> > > mdev_netdev=enmfoo
> > >
> > > Pros:
> > > 1. All same as option-1.
> > > 2. Doesn't affect existing mdev naming scheme.
> > > Cons:
> > > 1. Index scheme of option-1 is better which can number large number of
> > > mdevs with fewer characters, simplifying the management tool.
> > 
> > I believe that Alex pointed out another "Cons" to all three options, which
> > is that it forces user-space to resolve potential race conditions when
> > creating an index or short name or alias.
> >   
> This race condition exists for at least two subsystems that I know of, i.e.
> netdev and rdma.
> If a device with a given name exists, the subsystem returns an error.
> When user space gets error code EEXIST, it can pick a different
> identifier.

If you decouple device creation and setting the alias/index, you make
the issue visible and thus much more manageable.

> 
> > Also, what happens if `index=10` is not provided on the command-line?
> > Does that make the device unusable for your purpose?  
> Yes, it is unusable to an extent.
> Currently we have DEVLINK_PORT_FLAVOUR_PCI_VF in include/uapi/linux/devlink.h
> Similar to it, we need to have DEVLINK_PORT_FLAVOUR_MDEV for mdev eswitch 
> ports.
> This port flavour needs to generate phys_port_name(). This should be user 
> parameter driven.
> Because representor netdevice name is generated based on this parameter.

I'm also unsure how the extra parameter is supposed to work; writing it
to the create attribute does not sound right.

mdevctl supports setting additional parameters on an already created
device (see the examples provided for vfio-ap), so going that route
would actually work out of the box from the tooling side.

What you would need is some kind of synchronization/locking to make
sure that you only link up to the other device after the extra
attribute has been set and that you don't allow to change it as long as
it is associated with the other side. I do not know enough about the
actual devices to suggest something here; if you need userspace
cooperation, maybe 

Re: [PATCH] powerpc: Allow flush_(inval_)dcache_range to work across ranges >4GB

2019-08-20 Thread Greg Kroah-Hartman
On Fri, Aug 16, 2019 at 09:14:12AM +0200, Greg Kroah-Hartman wrote:
> On Fri, Aug 16, 2019 at 11:42:22AM +1000, Michael Ellerman wrote:
> > Greg Kroah-Hartman  writes:
> > > On Thu, Aug 15, 2019 at 02:55:42PM +1000, Alastair D'Silva wrote:
> > >> From: Alastair D'Silva 
> > >> 
> > >> Heads Up: This patch cannot be submitted to Linus's tree, as the affected
> > >> assembler functions have already been converted to C.
> > 
> > That was done in upstream commit:
> > 
> > 22e9c88d486a ("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
> > 
> > Which is a larger change that we don't want to backport. This patch is a
> > minimal fix for stable trees.
> > 
> > 
> > >> When calling flush_(inval_)dcache_range with a size >4GB, we were masking
> > >> off the upper 32 bits, so we would incorrectly flush a range smaller
> > >> than intended.
> > >> 
> > >> This patch replaces the 32 bit shifts with 64 bit ones, so that
> > >> the full size is accounted for.
> > >> 
> > >> Signed-off-by: Alastair D'Silva 
> > >> ---
> > >>  arch/powerpc/kernel/misc_64.S | 4 ++--
> > >>  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > Acked-by: Michael Ellerman 
> > 
> > > 
> > >
> > > This is not the correct way to submit patches for inclusion in the
> > > stable kernel tree.  Please read:
> > > 
> > > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> > > for how to do this properly.
> > >
> > > 
> > 
> > Hi Greg,
> > 
> > This is "option 3", submit the patch directly, and the patch "deviates
> > from the original upstream patch" because the upstream patch was a
> > wholesale conversion from asm to C.
> > 
> > This patch applies cleanly to v4.14 and v4.19.
> > 
> > The change log should have mentioned which upstream patch it is not a
> > backport of, is there anything else we should have done differently to
> > avoid the formletter bot :)
> 
> That is exactly what you should have done.  It needs to be VERY explicit
> as to why this is being submitted different from what upstream did, and
> to what trees it needs to go to and who is going to be responsible for
> when it breaks.  And it will break :)

And it needs to be done before I can apply it, I've dropped this thread
from my queue now.

thanks,

greg k-h


RE: [PATCH] net/ncsi: add control packet payload to NC-SI commands from netlink

2019-08-20 Thread Justin.Lee1
Hi Ben, 

> Hi Justin, 
> 
> > Hi Ben,
> >
> > I have a similar fix locally with a different approach, as the command
> > handler may have some expectations for those bytes.
> > We can use the NCSI_PKT_CMD_OEM handler as it only copies data based on
> > the payload length.
> 
> Great! Yes I was thinking the same, we just need some way to take the data
> payload sent from the netlink message and send it over NC-SI.
> 
> >
> > diff --git a/net/ncsi/ncsi-cmd.c b/net/ncsi/ncsi-cmd.c
> > index 5c3fad8..3b01f65 100644
> > --- a/net/ncsi/ncsi-cmd.c
> > +++ b/net/ncsi/ncsi-cmd.c
> > @@ -309,14 +309,19 @@ static struct ncsi_request *ncsi_alloc_command(struct 
> > ncsi_cmd_arg *nca)
> >  
> >  int ncsi_xmit_cmd(struct ncsi_cmd_arg *nca)  {
> > + struct ncsi_cmd_handler *nch = NULL;
> > struct ncsi_request *nr;
> > + unsigned char type;
> > struct ethhdr *eh;
> > -   struct ncsi_cmd_handler *nch = NULL;
> > int i, ret;
> >  
> > + if (nca->req_flags == NCSI_REQ_FLAG_NETLINK_DRIVEN)
> > + type = NCSI_PKT_CMD_OEM;
> > + else
> > + type = nca->type;
> > /* Search for the handler */
> > for (i = 0; i < ARRAY_SIZE(ncsi_cmd_handlers); i++) {
> > -   if (ncsi_cmd_handlers[i].type == nca->type) {
> > + if (ncsi_cmd_handlers[i].type == type) {
> > if (ncsi_cmd_handlers[i].handler)
> > > nch = &ncsi_cmd_handlers[i];
> > else
> >
> 
> So in this case NCSI_PKT_CMD_OEM would be the default handler for all NC-SI 
> command over netlink  (standard and OEM), correct?
Yes, that is correct. The handler for NCSI_PKT_CMD_OEM command is generic.

> Should we rename this to something like NCSI_PKT_CMD_GENERIC for clarity 
> perhaps?  Do you plan to upstream this patch?  
NCSI_PKT_CMD_OEM is a real command type and it is defined by the NC-SI
specification.
We can add comments to indicate that we use the generic command handler from
the NCSI_PKT_CMD_OEM command.

Does the change work for you? If so, I will prepare the patch.

> 
> 
> Also do you have local patch to support NCSI_PKT_CMD_PLDM and the PLDM over 
> NC-SI commands defined here 
> (https://www.dmtf.org/sites/default/files/NC-SI_1.2_PLDM_Support_over_RBT_Commands_Proposal.pdf)?
> If not I can send my local changes - but I think we can use the same 
> NCSI_PKT_CMD_OEM handler to transport PLDM payload over NC-SI.
> What do you think?
No, I don't have any change currently to support these commands. It should be 
very similar to NCSI_PKT_CMD_OEM handler with some minor modification.

> 
> (CC Deepak as I think once this is in place we can use pldmtool to send basic 
> PLDM payloads over NC-SI)
> 
> Regards,
> -Ben

Thanks,
Justin



Re: [PATCH v2 4/4] leds: lm3532: Add full scale current configuration

2019-08-20 Thread Pavel Machek
Hi!

> >No need to move ctrl_brt_pointer... to keep order consistent with docs.
> 
> OK I will reset the patches and get rid of that change.  I think this got
> moved when I applied the v1 patch.
> 
> 
> >>+   fs_current_val = led->full_scale_current - LM3532_FS_CURR_MIN /
> >>+LM3532_FS_CURR_STEP;
> >The computation is wrong ... needs () AFAICT.
> 
> Hmm. Doesn't order of operations take precedence?
> 
> I will add the () unless checkpatch cribs about them

I may be misunderstanding. What do you expect the computation to be? /
has higher priority than -, right? Can you test it provides expected
results?
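
To make the difference concrete, here is a small sketch. The constants are
made-up illustrative values, not taken from the driver, and both helper
names are hypothetical. Since / binds tighter than -, the expression as
written divides only the minimum by the step:

```c
#include <stdint.h>

/* Illustrative values only (assumed, in microamps); not from the driver. */
#define FS_CURR_MIN   5000u
#define FS_CURR_STEP   800u

/* As written in the quoted hunk: only FS_CURR_MIN is divided by the step. */
static inline uint32_t fs_code_as_written(uint32_t full_scale_ua)
{
	return full_scale_ua - FS_CURR_MIN / FS_CURR_STEP;
}

/* With explicit parentheses: subtract the offset first, then divide. */
static inline uint32_t fs_code_parenthesized(uint32_t full_scale_ua)
{
	return (full_scale_ua - FS_CURR_MIN) / FS_CURR_STEP;
}
```

With these assumed values and an input of 20200, the first form gives
20200 - 6 = 20194 while the second gives 15200 / 800 = 19, so only the
parenthesized form yields a small step index.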

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




Re: [PATCH v8 08/20] adfs: Fill in max and min timestamps in sb

2019-08-20 Thread Matthew Wilcox
On Sun, Aug 18, 2019 at 09:58:05AM -0700, Deepa Dinamani wrote:
> Note that the min timestamp is assumed to be
> 01 Jan 1970 00:00:00 (Unix epoch). This is consistent
> with the way we convert timestamps in adfs_adfs2unix_time().

That's not actually correct.  RISC OS timestamps are centiseconds since
1900 stored in 5 bytes.
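
To illustrate, a minimal sketch assuming the 40-bit centisecond encoding
described above. adfs_csec_to_unix_sec() and ADFS_STAMP_MAX_CSEC are
hypothetical names, not existing kernel symbols; the epoch delta is the
value from the quoted patch:

```c
#include <stdint.h>

/* Seconds from 01 Jan 1900 (RISC OS epoch) to 01 Jan 1970 (Unix epoch). */
#define RISC_OS_EPOCH_DELTA  2208988800LL

/* Largest value a 5-byte (40-bit) centisecond stamp can hold; name assumed. */
#define ADFS_STAMP_MAX_CSEC  0xFFFFFFFFFFLL

/* Hypothetical helper: 40-bit centiseconds since 1900 -> Unix seconds. */
static inline int64_t adfs_csec_to_unix_sec(int64_t csec)
{
	return csec / 100 - RISC_OS_EPOCH_DELTA;
}
```

Under these assumptions a stamp of 0 maps to -2208988800 (01 Jan 1900) and
the largest stamp maps to 8786127477, so the representable range starts
well before the Unix epoch rather than at 0.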

> Signed-off-by: Deepa Dinamani 
> ---
>  fs/adfs/adfs.h  | 13 +
>  fs/adfs/inode.c |  8 ++--
>  fs/adfs/super.c |  2 ++
>  3 files changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/adfs/adfs.h b/fs/adfs/adfs.h
> index b7e844d2f321..dca8b23aa43f 100644
> --- a/fs/adfs/adfs.h
> +++ b/fs/adfs/adfs.h
> @@ -3,6 +3,19 @@
>  #include 
>  #include 
>  
> +/*
> + * 01 Jan 1970 00:00:00 (Unix epoch) as seconds since
> + * 01 Jan 1900 00:00:00 (RISC OS epoch)
> + */
> +#define RISC_OS_EPOCH_DELTA 2208988800LL
> +
> +/*
> + * Convert 40 bit centi seconds to seconds
> + * since 01 Jan 1900 00:00:00 (RISC OS epoch)
> + * The result is 2248-06-03 06:57:57 GMT
> + */
> +#define ADFS_MAX_TIMESTAMP ((0xFFFFFFFFFFLL / 100) - RISC_OS_EPOCH_DELTA)
> +
>  /* Internal data structures for ADFS */
>  
>  #define ADFS_FREE_FRAG0
> diff --git a/fs/adfs/inode.c b/fs/adfs/inode.c
> index 124de75413a5..41eca1c451dc 100644
> --- a/fs/adfs/inode.c
> +++ b/fs/adfs/inode.c
> @@ -167,11 +167,7 @@ static void
>  adfs_adfs2unix_time(struct timespec64 *tv, struct inode *inode)
>  {
>   unsigned int high, low;
> - /* 01 Jan 1970 00:00:00 (Unix epoch) as nanoseconds since
> -  * 01 Jan 1900 00:00:00 (RISC OS epoch)
> -  */
> - static const s64 nsec_unix_epoch_diff_risc_os_epoch =
> - 2208988800000000000LL;
> + static const s64 nsec_unix_epoch_diff_risc_os_epoch = RISC_OS_EPOCH_DELTA * NSEC_PER_SEC;
>   s64 nsec;
>  
>   if (!adfs_inode_is_stamped(inode))
> @@ -216,7 +212,7 @@ adfs_unix2adfs_time(struct inode *inode, unsigned int 
> secs)
>   if (adfs_inode_is_stamped(inode)) {
>   /* convert 32-bit seconds to 40-bit centi-seconds */
>   low  = (secs & 255) * 100;
> - high = (secs / 256) * 100 + (low >> 8) + 0x336e996a;
> + high = (secs / 256) * 100 + (low >> 8) + (RISC_OS_EPOCH_DELTA*100/256);
>  
>   ADFS_I(inode)->loadaddr = (high >> 24) |
>   (ADFS_I(inode)->loadaddr & ~0xff);
> diff --git a/fs/adfs/super.c b/fs/adfs/super.c
> index 65b04ebb51c3..f074fe7d7158 100644
> --- a/fs/adfs/super.c
> +++ b/fs/adfs/super.c
> @@ -463,6 +463,8 @@ static int adfs_fill_super(struct super_block *sb, void 
> *data, int silent)
>   asb->s_map_size = dr->nzones | (dr->nzones_high << 8);
>   asb->s_map2blk  = dr->log2bpmb - dr->log2secsize;
>   asb->s_log2sharesize= dr->log2sharesize;
> + sb->s_time_min  = 0;
> + sb->s_time_max  = ADFS_MAX_TIMESTAMP;
>  
>   asb->s_map = adfs_read_map(sb, dr);
>   if (IS_ERR(asb->s_map)) {
> -- 
> 2.17.1
> 


Re: [PATCH v4 1/2] dt-bindings: arm: imx: add imx8mq nitrogen support

2019-08-20 Thread Dafna Hirschfeld
On Mon, 2019-08-19 at 14:08 -0500, Rob Herring wrote:
> On Mon, Aug 19, 2019 at 12:26 PM Dafna Hirschfeld
>  wrote:
> > From: Gary Bisson 
> > 
> > The Nitrogen8M is an ARM based single board computer (SBC)
> > designed to leverage the full capabilities of NXP’s i.MX8M
> > Quad processor.
> > 
> > Signed-off-by: Gary Bisson 
> > Signed-off-by: Troy Kisky 
> > [Dafna: porting vendor's code to mainline]
> > Signed-off-by: Dafna Hirschfeld 
> > ---
> >  Documentation/devicetree/bindings/arm/fsl.yaml | 1 +
> >  1 file changed, 1 insertion(+)
> 
> Please add acks/reviewed-bys when posting new versions.
> 
Hi,
Thank you for the remark, I forgot to add them. I will add them in the
next version.
Regards,
Dafna Hirschfeld

> Rob



[PATCH 2/3] ASoC: mchp-i2s-mcc: Fix unprepare of GCLK

2019-08-20 Thread Codrin Ciubotariu
If hw_free() gets called after hw_params(), GCLK remains prepared,
preventing further use of it. This patch fixes this by unpreparing the
clock in hw_free() or if hw_params() gets an error.

Fixes: 7e0cdf545a55 ("ASoC: mchp-i2s-mcc: add driver for I2SC Multi-Channel Controller")
Signed-off-by: Codrin Ciubotariu 
---
 sound/soc/atmel/mchp-i2s-mcc.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/sound/soc/atmel/mchp-i2s-mcc.c b/sound/soc/atmel/mchp-i2s-mcc.c
index 8272915fa09b..ab7d5f98e759 100644
--- a/sound/soc/atmel/mchp-i2s-mcc.c
+++ b/sound/soc/atmel/mchp-i2s-mcc.c
@@ -670,8 +670,13 @@ static int mchp_i2s_mcc_hw_params(struct snd_pcm_substream 
*substream,
}
 
ret = regmap_write(dev->regmap, MCHP_I2SMCC_MRA, mra);
-   if (ret < 0)
+   if (ret < 0) {
+   if (dev->gclk_use) {
+   clk_unprepare(dev->gclk);
+   dev->gclk_use = 0;
+   }
return ret;
+   }
return regmap_write(dev->regmap, MCHP_I2SMCC_MRB, mrb);
 }
 
@@ -710,9 +715,13 @@ static int mchp_i2s_mcc_hw_free(struct snd_pcm_substream 
*substream,
regmap_write(dev->regmap, MCHP_I2SMCC_CR, MCHP_I2SMCC_CR_CKDIS);
 
if (dev->gclk_running) {
-   clk_disable_unprepare(dev->gclk);
+   clk_disable(dev->gclk);
dev->gclk_running = 0;
}
+   if (dev->gclk_use) {
+   clk_unprepare(dev->gclk);
+   dev->gclk_use = 0;
+   }
}
 
return 0;
-- 
2.20.1



[PATCH 1/3] ASoC: mchp-i2s-mcc: Wait for RX/TX RDY only if controller is running

2019-08-20 Thread Codrin Ciubotariu
Since hw_free() can be called multiple times and not just after a stop
trigger command, we should check whether the RX or TX ready interrupt was
truly enabled previously. For this, we ensure that the condition of the
wait event is always true, except when RX/TX interrupts are enabled.

Fixes: 7e0cdf545a55 ("ASoC: mchp-i2s-mcc: add driver for I2SC Multi-Channel Controller")
Signed-off-by: Codrin Ciubotariu 
---
 sound/soc/atmel/mchp-i2s-mcc.c | 28 
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/sound/soc/atmel/mchp-i2s-mcc.c b/sound/soc/atmel/mchp-i2s-mcc.c
index 86495883ca3f..8272915fa09b 100644
--- a/sound/soc/atmel/mchp-i2s-mcc.c
+++ b/sound/soc/atmel/mchp-i2s-mcc.c
@@ -686,22 +686,24 @@ static int mchp_i2s_mcc_hw_free(struct snd_pcm_substream 
*substream,
err = wait_event_interruptible_timeout(dev->wq_txrdy,
   dev->tx_rdy,
   msecs_to_jiffies(500));
+   if (err == 0) {
+   dev_warn_once(dev->dev,
+ "Timeout waiting for Tx ready\n");
+   regmap_write(dev->regmap, MCHP_I2SMCC_IDRA,
+MCHP_I2SMCC_INT_TXRDY_MASK(dev->channels));
+   dev->tx_rdy = 1;
+   }
} else {
err = wait_event_interruptible_timeout(dev->wq_rxrdy,
   dev->rx_rdy,
   msecs_to_jiffies(500));
-   }
-
-   if (err == 0) {
-   u32 idra;
-
-   dev_warn_once(dev->dev, "Timeout waiting for %s\n",
- is_playback ? "Tx ready" : "Rx ready");
-   if (is_playback)
-   idra = MCHP_I2SMCC_INT_TXRDY_MASK(dev->channels);
-   else
-   idra = MCHP_I2SMCC_INT_RXRDY_MASK(dev->channels);
-   regmap_write(dev->regmap, MCHP_I2SMCC_IDRA, idra);
+   if (err == 0) {
+   dev_warn_once(dev->dev,
+ "Timeout waiting for Rx ready\n");
+   regmap_write(dev->regmap, MCHP_I2SMCC_IDRA,
+MCHP_I2SMCC_INT_RXRDY_MASK(dev->channels));
+   dev->rx_rdy = 1;
+   }
}
 
if (!mchp_i2s_mcc_is_running(dev)) {
@@ -809,6 +811,8 @@ static int mchp_i2s_mcc_dai_probe(struct snd_soc_dai *dai)
 
init_waitqueue_head(&dev->wq_txrdy);
init_waitqueue_head(&dev->wq_rxrdy);
+   dev->tx_rdy = 1;
+   dev->rx_rdy = 1;
 
snd_soc_dai_init_dma_data(dai, &dev->playback, &dev->capture);
 
-- 
2.20.1



[PATCH 3/3] ASoC: mchp-i2s-mcc: Fix simultaneous capture and playback in master mode

2019-08-20 Thread Codrin Ciubotariu
This controller supports capture and playback running at the same time,
with the limitation that both capture and playback must be configured the
same way (sample rate, sample format, number of channels, etc). For this,
we have to ensure that the configuration registers look the same when
capture and playback are initiated.
This patch fixes a bug in which the controller is in master mode and the
hw_params() callback fails for the second audio stream. The failure occurs
because the divisors are calculated after comparing the configuration
registers for capture and playback. The fix consists in calculating the
divisors before comparing the configuration registers. BCLK and LRC are
then configured and started only if the controller is not already running.

Fixes: 7e0cdf545a55 ("ASoC: mchp-i2s-mcc: add driver for I2SC Multi-Channel Controller")
Signed-off-by: Codrin Ciubotariu 
---
 sound/soc/atmel/mchp-i2s-mcc.c | 70 ++
 1 file changed, 37 insertions(+), 33 deletions(-)

diff --git a/sound/soc/atmel/mchp-i2s-mcc.c b/sound/soc/atmel/mchp-i2s-mcc.c
index ab7d5f98e759..befc2a3a05b0 100644
--- a/sound/soc/atmel/mchp-i2s-mcc.c
+++ b/sound/soc/atmel/mchp-i2s-mcc.c
@@ -392,11 +392,11 @@ static int mchp_i2s_mcc_clk_get_rate_diff(struct clk *clk,
 }
 
 static int mchp_i2s_mcc_config_divs(struct mchp_i2s_mcc_dev *dev,
-   unsigned int bclk, unsigned int *mra)
+   unsigned int bclk, unsigned int *mra,
+   unsigned long *best_rate)
 {
unsigned long clk_rate;
unsigned long lcm_rate;
-   unsigned long best_rate = 0;
unsigned long best_diff_rate = ~0;
unsigned int sysclk;
struct clk *best_clk = NULL;
@@ -423,7 +423,7 @@ static int mchp_i2s_mcc_config_divs(struct mchp_i2s_mcc_dev *dev,
 (clk_rate == bclk || clk_rate / (bclk * 2) <= GENMASK(5, 0));
 clk_rate += lcm_rate) {
ret = mchp_i2s_mcc_clk_get_rate_diff(dev->gclk, clk_rate,
-&best_clk, &best_rate,
+&best_clk, best_rate,
 &best_diff_rate);
if (ret) {
dev_err(dev->dev, "gclk error for rate %lu: %d",
@@ -437,7 +437,7 @@ static int mchp_i2s_mcc_config_divs(struct mchp_i2s_mcc_dev *dev,
}
 
ret = mchp_i2s_mcc_clk_get_rate_diff(dev->pclk, clk_rate,
-&best_clk, &best_rate,
+&best_clk, best_rate,
 &best_diff_rate);
if (ret) {
dev_err(dev->dev, "pclk error for rate %lu: %d",
@@ -459,33 +459,17 @@ static int mchp_i2s_mcc_config_divs(struct mchp_i2s_mcc_dev *dev,
 
dev_dbg(dev->dev, "source CLK is %s with rate %lu, diff %lu\n",
best_clk == dev->pclk ? "pclk" : "gclk",
-   best_rate, best_diff_rate);
-
-   /* set the rate */
-   ret = clk_set_rate(best_clk, best_rate);
-   if (ret) {
-   dev_err(dev->dev, "unable to set rate %lu to %s: %d\n",
-   best_rate, best_clk == dev->pclk ? "PCLK" : "GCLK",
-   ret);
-   return ret;
-   }
+   *best_rate, best_diff_rate);
 
/* Configure divisors */
if (dev->sysclk)
-   *mra |= MCHP_I2SMCC_MRA_IMCKDIV(best_rate / (2 * sysclk));
-   *mra |= MCHP_I2SMCC_MRA_ISCKDIV(best_rate / (2 * bclk));
+   *mra |= MCHP_I2SMCC_MRA_IMCKDIV(*best_rate / (2 * sysclk));
+   *mra |= MCHP_I2SMCC_MRA_ISCKDIV(*best_rate / (2 * bclk));
 
-   if (best_clk == dev->gclk) {
+   if (best_clk == dev->gclk)
*mra |= MCHP_I2SMCC_MRA_SRCCLK_GCLK;
-   ret = clk_prepare(dev->gclk);
-   if (ret < 0)
-   dev_err(dev->dev, "unable to prepare GCLK: %d\n", ret);
-   else
-   dev->gclk_use = 1;
-   } else {
+   else
*mra |= MCHP_I2SMCC_MRA_SRCCLK_PCLK;
-   dev->gclk_use = 0;
-   }
 
return 0;
 }
@@ -502,6 +486,7 @@ static int mchp_i2s_mcc_hw_params(struct snd_pcm_substream *substream,
  struct snd_pcm_hw_params *params,
  struct snd_soc_dai *dai)
 {
+   unsigned long rate = 0;
struct mchp_i2s_mcc_dev *dev = snd_soc_dai_get_drvdata(dai);
u32 mra = 0;
u32 mrb = 0;
@@ -640,6 +625,17 @@ static int mchp_i2s_mcc_hw_params(struct snd_pcm_substream *substream,
return -EINVAL;
}
 
+   if (set_divs) {
+   bclk_rate = frame_length * params_rate(params);
+   ret = mchp_i2s_mcc_config_divs(dev, bclk_rate, &mra,
+

[PATCH 0/3] ASoC: mchp-i2s-mcc: Several fixes

2019-08-20 Thread Codrin Ciubotariu
This patchset fixes some issues detected during further testing of the
Microchip I2S multichannel controller. The first two patches fix some
issues that appear mostly when hw_free() and hw_params() callbacks
are called multiple times. The third patch fixes a problem caused
when the controller is in master mode and both capture and playback
are running at the same time.

All three patches have a "Fixes" tag. Although they are independent,
some conflicts might appear if they are not applied in the order
presented in this patchset. If so, please let me know so I can rebase
them.

Codrin Ciubotariu (3):
  ASoC: mchp-i2s-mcc: Wait for RX/TX RDY only if controller is running
  ASoC: mchp-i2s-mcc: Fix unprepare of GCLK
  ASoC: mchp-i2s-mcc: Fix simultaneous capture and playback in master
mode

 sound/soc/atmel/mchp-i2s-mcc.c | 111 +++--
 1 file changed, 64 insertions(+), 47 deletions(-)

-- 
2.20.1



Re: [Linux-kernel-mentees][PATCH v6 1/2] sgi-gru: Convert put_page() to put_user_page*()

2019-08-20 Thread Bharath Vedartham
On Mon, Aug 19, 2019 at 12:30:18PM -0700, John Hubbard wrote:
> On 8/19/19 12:06 PM, Bharath Vedartham wrote:
> >On Mon, Aug 19, 2019 at 07:56:11AM -0500, Dimitri Sivanich wrote:
> >>Reviewed-by: Dimitri Sivanich 
> >Thanks!
> >
> >John, would you like to take this patch into your miscellaneous
> >conversions patch set?
> >
> 
> (+Andrew and Michal, so they know where all this is going.)
> 
> Sure, although that conversion series [1] is on a brief hold, because
> there are additional conversions desired, and the API is still under
> discussion. Also, reading between the lines of Michal's response [2]
> about it, I think people would prefer that the next revision include
> the following, for each conversion site:
> 
> Conversion of gup/put_page sites:
> 
> Before:
> 
>   get_user_pages(...);
>   ...
>   for each page:
>   put_page();
> 
> After:
>   
>   gup_flags |= FOLL_PIN; (maybe FOLL_LONGTERM in some cases)
>   vaddr_pin_user_pages(...gup_flags...)
>   ...
>   vaddr_unpin_user_pages(); /* which invokes put_user_page() */
> 
> Fortunately, it's not harmful for the simpler conversion from put_page()
> to put_user_page() to happen first, and in fact those have usually led
> to simplifications, paving the way to make it easier to call
> vaddr_unpin_user_pages(), once it's ready. (And showing exactly what
> to convert, too.)
> 
> So for now, I'm going to just build on top of Ira's tree, and once the
> vaddr*() API settles down, I'll send out an updated series that attempts
> to include the reviews and ACKs so far (I'll have to review them, but
> make a note that review or ACK was done for part of the conversion),
> and adds the additional gup(FOLL_PIN), and uses vaddr*() wrappers instead of
> gup/pup.
> 
> [1] https://lore.kernel.org/r/20190807013340.9706-1-jhubb...@nvidia.com
> 
> [2] https://lore.kernel.org/r/20190809175210.gr18...@dhcp22.suse.cz
> 
Cc'ing lkml (I missed the 'l' in this series).

Sounds good. It makes sense to keep the entire gup in the kernel rather
than to expose it outside.

I'll make sure to check out the emails on the vaddr*() API and pace my work
on it accordingly.

Thank you
Bharath
> thanks,
> -- 
> John Hubbard
> NVIDIA


Re: [PATCH v8 11/28] x86/asm/head: annotate data appropriatelly

2019-08-20 Thread Borislav Petkov
> Subject: Re: [PATCH v8 11/28] x86/asm/head: annotate data appropriatelly

appropriately

On Thu, Aug 08, 2019 at 12:38:37PM +0200, Jiri Slaby wrote:
> Use the new SYM_DATA, SYM_DATA_START, and SYM_DATA_END in both 32 and 64
> bit heads.  In the 64-bit version, define also
> SYM_DATA_START_PAGE_ALIGNED locally using the new SYM_START. It is used
> in the code instead of NEXT_PAGE() which was defined in this file and
> has been using the obsolete macro GLOBAL().
> 
> Now, the data in the 64-bit object file look sane:
> Value     Size Type    Bind   Vis      Ndx Name
>   0000    4096 OBJECT  GLOBAL DEFAULT   15 init_level4_pgt
>   1000    4096 OBJECT  GLOBAL DEFAULT   15 level3_kernel_pgt
>   2000    2048 OBJECT  GLOBAL DEFAULT   15 level2_kernel_pgt
>   3000    4096 OBJECT  GLOBAL DEFAULT   15 level2_fixmap_pgt
>   4000    4096 OBJECT  GLOBAL DEFAULT   15 level1_fixmap_pgt
>   5000       2 OBJECT  GLOBAL DEFAULT   15 early_gdt_descr
>   5002       8 OBJECT  LOCAL  DEFAULT   15 early_gdt_descr_base
>   500a       8 OBJECT  GLOBAL DEFAULT   15 phys_base
>   0000       8 OBJECT  GLOBAL DEFAULT   17 initial_code
>   0008       8 OBJECT  GLOBAL DEFAULT   17 initial_gs
>   0010       8 OBJECT  GLOBAL DEFAULT   17 initial_stack
>   0000       4 OBJECT  GLOBAL DEFAULT   19 early_recursion_flag
>   1000    4096 OBJECT  GLOBAL DEFAULT   19 early_level4_pgt
>   2000 0x40000 OBJECT  GLOBAL DEFAULT   19 early_dynamic_pgts
>   0000    4096 OBJECT  GLOBAL DEFAULT   22 empty_zero_page
> 
> All have correct size and type.

Nice.

> 
> Note, that we can now see that it might be worth pushing
> early_recursion_flag after early_dynamic_pgts -- we are wasting almost
> 4K of .init.data.

Yes, please do that in a separate patch, which can even go in separately.
I get here:

---
Disassembly of section .init.data:

82684000 :
...

82685000 :
...

82686000 :
...

826c6000 :
...

826c6020 :
---


vs


---
Disassembly of section .init.data:

82684000 :
...

82685000 :
...

826c5000 :
826c5000:   00 00   add    %al,(%rax)
...

826c5004 :
...

826c5020 :
---

That's exactly 4K saved.


> Signed-off-by: Jiri Slaby 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: x...@kernel.org
> ---
>  arch/x86/kernel/head_32.S | 29 ---
>  arch/x86/kernel/head_64.S | 78 +--
>  2 files changed, 58 insertions(+), 49 deletions(-)
> 
> diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
> index 0bae769b7b59..2d5390d84467 100644
> --- a/arch/x86/kernel/head_32.S
> +++ b/arch/x86/kernel/head_32.S
> @@ -502,8 +502,7 @@ ENDPROC(early_ignore_irq)
>  
>  __INITDATA
>   .align 4
> -GLOBAL(early_recursion_flag)
> - .long 0
> +SYM_DATA(early_recursion_flag, .long 0)
>  
>  __REFDATA
>   .align 4
> @@ -551,7 +550,7 @@ EXPORT_SYMBOL(empty_zero_page)
>  __PAGE_ALIGNED_DATA
>   /* Page-aligned for the benefit of paravirt? */
>   .align PGD_ALIGN
> -ENTRY(initial_page_table)
> +SYM_DATA_START(initial_page_table)
>   .long   pa(initial_pg_pmd+PGD_IDENT_ATTR),0 /* low identity map */
>  # if KPMDS == 3
>   .long   pa(initial_pg_pmd+PGD_IDENT_ATTR),0
> @@ -569,17 +568,18 @@ ENTRY(initial_page_table)
>  #  error "Kernel PMDs should be 1, 2 or 3"
>  # endif
>   .align PAGE_SIZE/* needs to be page-sized too */
> +SYM_DATA_END(initial_page_table)
>  #endif
>  
>  .data
>  .balign 4
> -ENTRY(initial_stack)
> - /*
> -  * The SIZEOF_PTREGS gap is a convention which helps the in-kernel
> -  * unwinder reliably detect the end of the stack.
> -  */
> - .long init_thread_union + THREAD_SIZE - SIZEOF_PTREGS - \
> -   TOP_OF_KERNEL_STACK_PADDING;
> +/*
> + * The SIZEOF_PTREGS gap is a convention which helps the in-kernel unwinder
> + * reliably detect the end of the stack.
> + */
> +SYM_DATA(initial_stack,
> + .long init_thread_union + THREAD_SIZE -
> + SIZEOF_PTREGS - TOP_OF_KERNEL_STACK_PADDING)
>  
>  __INITRODATA
>  int_msg:
> @@ -600,22 +600,25 @@ int_msg:
>   ALIGN
>  # early boot GDT descriptor (must use 1:1 address mapping)
>   .word 0 # 32 bit align gdt_desc.address
> -boot_gdt_descr:
> +SYM_DATA_START(boot_gdt_descr)
>   .word __BOOT_DS+7
>   .long boot_gdt - __PAGE_OFFSET
> +SYM_DATA_END(boot_gdt_descr)

So there's one "globl boot_gdt_descr" above already and this turns into:

 .data
.globl boot_gdt_descr
^

 .align 4,0x90
 # early boot GDT descriptor (must use 1:1 address mapping)
 .word 0 # 32 bit align gdt_desc.address
.globl boot_gdt_descr ; ; boot_gdt_descr:
^

I guess you can remove the above one.

Also, this can be made a local symbol too.

>  # boot GDT descriptor (later on used by CPU#0):
>   .word 0 # 32 bit align gdt_desc.address
> 

Re: [PATCH v2 1/2] dt-bindings: media: Add YAML schemas for the generic RC bindings

2019-08-20 Thread Sean Young
On Tue, Aug 20, 2019 at 10:52:29AM -0500, Rob Herring wrote:
> On Tue, Aug 20, 2019 at 4:50 AM Maxime Ripard  wrote:
> > On Tue, Aug 20, 2019 at 09:15:26AM +0100, Sean Young wrote:
> > > On Mon, Aug 19, 2019 at 08:26:18PM +0200, Maxime Ripard wrote:
> > > > From: Maxime Ripard 
> > > >
> > > > The RC controllers have a bunch of generic properties that are needed
> > > > in a device tree. Add YAML schemas for those.
> > > >
> > > > Reviewed-by: Rob Herring 
> > > > Signed-off-by: Maxime Ripard 
> > >
> > > For the series (both 1/2 and 2/2):
> > >
> > > Reviewed-by: Sean Young 
> > >
> > > Whose tree should this go through?
> >
> > Either yours or Rob's, I guess?
> 
> Sean's because there are other changes to
> Documentation/devicetree/bindings/media/sunxi-ir.txt in -next.

Good point, I'll take them.

Thanks
Sean


Re: [PATCH 2/6] dt-bindings: net: sun8i-a83t-emac: Add phy-io-supply property

2019-08-20 Thread Rob Herring
On Tue, Aug 20, 2019 at 9:53 AM  wrote:
>
> From: Ondrej Jirman 
>
> Some PHYs require separate power supply for I/O pins in some modes
> of operation. Add phy-io-supply property, to allow enabling this
> power supply.

Perhaps since this is new, such phys should have *-supply in their nodes.

>
> Signed-off-by: Ondrej Jirman 
> ---
>  .../devicetree/bindings/net/allwinner,sun8i-a83t-emac.yaml| 4 
>  1 file changed, 4 insertions(+)


Re: [PATCH 3/6] net: stmmac: sun8i: Use devm_regulator_get for PHY regulator

2019-08-20 Thread Ondřej Jirman
Hi,

On Tue, Aug 20, 2019 at 05:57:44PM +0200, Andrew Lunn wrote:
> On Tue, Aug 20, 2019 at 05:47:14PM +0200, Ondřej Jirman wrote:
> > Hi Andrew,
> > 
> > On Tue, Aug 20, 2019 at 05:39:39PM +0200, Andrew Lunn wrote:
> > > On Tue, Aug 20, 2019 at 04:53:40PM +0200, meg...@megous.com wrote:
> > > > From: Ondrej Jirman 
> > > > 
> > > > Use devm_regulator_get instead of devm_regulator_get_optional and rely
> > > > on dummy supply. This avoids NULL checks before regulator_enable/disable
> > > > calls.
> > > 
> > > Hi Ondrej
> > > 
> > > What do you mean by a dummy supply? I'm just trying to make sure you
> > > are not breaking backwards compatibility.
> > 
> > Sorry, I mean dummy regulator. See:
> > 
> > https://elixir.bootlin.com/linux/latest/source/drivers/regulator/core.c#L1874
> > 
> > > On systems that use DT (i.e. have_full_constraints() == true), when the
> > > regulator is not found (ENODEV, not specified in DT), regulator_get will
> > > return a fake dummy regulator that can be enabled/disabled, but doesn't do
> > > anything real.
> 
> Hi Ondrej
> 
> But we also gain a new warning:
> 
>   dev_warn(dev,
>"%s supply %s not found, using dummy regulator\n",
>devname, id);
> 
> This regulator is clearly optional, so there should not be a warning.
> 
> Maybe you can add a new get_type, OPTIONAL_GET, which does not issue
> the warning, but does give back a dummy regulator.

We already had an info message. See my other e-mail with the dmesg output.

IMO, that warning is useful during development, and more informative than the
previous one.

regards,
o.

> Thanks
>   Andrew
> 


Re: [PATCH 1/6] dt-bindings: net: sun8i-a83t-emac: Add phy-supply property

2019-08-20 Thread Rob Herring
On Tue, Aug 20, 2019 at 9:53 AM  wrote:
>
> From: Ondrej Jirman 
>
> This is already supported by the driver, but is missing from the
> bindings.

Really, the supply for the phy should be in the phy's node...

>
> Signed-off-by: Ondrej Jirman 
> ---
>  .../devicetree/bindings/net/allwinner,sun8i-a83t-emac.yaml| 4 
>  1 file changed, 4 insertions(+)

Reviewed-by: Rob Herring 


Re: [PATCH 5/6] dt-bindings: arm: amlogic: add SEI Robotics SEI610 bindings

2019-08-20 Thread Rob Herring
On Tue, Aug 20, 2019 at 9:41 AM Neil Armstrong  wrote:
>
> Add the compatible for the Amlogic SM1 Based SEI610 board.
>
> Signed-off-by: Neil Armstrong 
> ---
>  Documentation/devicetree/bindings/arm/amlogic.yaml | 2 ++
>  1 file changed, 2 insertions(+)

Reviewed-by: Rob Herring 


Re: [PATCH 4/6] dt-bindings: arm: amlogic: add SM1 bindings

2019-08-20 Thread Rob Herring
On Tue, Aug 20, 2019 at 9:41 AM Neil Armstrong  wrote:
>
> Add bindings for the new Amlogic SM1 SoC Family.
>
> It is a derivative of the G12A SoC Family with:
> - Cortex-A55 core instead of A53
> - more power domains
> - a neural network co-processor
> - a CSI input and image processor
>
> Signed-off-by: Neil Armstrong 
> ---
>  Documentation/devicetree/bindings/arm/amlogic.yaml | 3 +++
>  1 file changed, 3 insertions(+)

Reviewed-by: Rob Herring 


Re: [PATCH 1/1] netfilter: nf_tables: fib: Drop IPV6 packages if IPv6 is disabled on boot

2019-08-20 Thread Leonardo Bras
On Tue, 2019-08-20 at 07:36 +0200, Florian Westphal wrote:
> Wouldn't fib_netdev.c have the same problem?
Probably, but I haven't hit this issue yet.

> If so, might be better to place this test in both
> nft_fib6_eval_type and nft_fib6_eval.
I think that is possible, and not very hard to do.

But in my humble view, it looks like nft_fib_inet_eval() and
nft_fib_netdev_eval() are responsible for choosing a valid
protocol or dropping the packet.
I am not sure it would be a good move to transfer this
responsibility to nft_fib6_eval_type() and nft_fib6_eval(), so I would
rather add the same test to nft_fib_netdev_eval().

Does it make sense?

Thanks for the feedback!

Leonardo Bras





Re: [PATCH] sched/core: Schedule new worker even if PI-blocked

2019-08-20 Thread Sebastian Andrzej Siewior
On 2019-08-20 18:02:17 [+0200], Peter Zijlstra wrote:
> On Tue, Aug 20, 2019 at 05:54:01PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2019-08-20 17:20:25 [+0200], Peter Zijlstra wrote:
> 
> > > And am I right in thinking that that, again, is specific to the
> > > sleeping-spinlocks from PREEMPT_RT? Is there really nothing else that
> > > identifies those more specifically? It's been a while since I looked at
> > > them.
> > 
> > Not really. I hacked "int sleeping_lock" into task_struct which is
> > incremented each time a "sleeping lock" version of rtmutex is requested.
> > We have two users as of now:
> > - RCU, which checks if we schedule() while holding rcu_read_lock() which
> >   is okay if it is a sleeping lock.
> > 
> > - NOHZ's pending softirq detection while going to idle. It is possible
> >   that "ksoftirqd" and "current" are blocked on locks and the CPU goes
> >   to idle (because nothing else is runnable) with pending softirqs.
> > 
> > I wanted to let rtmutex invoke another schedule() function in case of a
> > sleeping lock to avoid the RCU warning. This would avoid incrementing
> > "sleeping_lock" in the fast path. But then I had no idea what to do with
> > the NOHZ thing.
> 
> Once upon a time there was also a shadow task->state thing, that was
> specific to the sleeping locks, because normally spinlocks don't muck
> with task->state and so we have code relying on it not getting trampled.
> 
> Can't we use that somehow? Or is that gone?

We have ->state and ->saved_state. While sleeping on a sleeping lock,
->state goes to ->saved_state (usually TASK_RUNNING) and ->state becomes
TASK_UNINTERRUPTIBLE. This is no different from a regular
blocked-on-I/O wait.
We could add a state, say, TASK_LOCK_BLOCK, to identify a task blocking
on a sleeping lock. This shouldn't break anything. After all, only a
regular "unlock" is allowed to wake such a task and "non-matching" wakes
are redirected to update ->saved_state.

> > > Also, I suppose it would be really good to put that in a comment.
> > So, what does that mean for that patch? According to my inbox it has
> > been applied to an "urgent" branch. Do I resubmit the whole thing or
> > just a comment on top?
> 
> Yeah, I'm not sure. I was surprised by that, because afaict all this is
> PREEMPT_RT specific and not really /urgent material in the first place.
> Ingo?

Sebastian


Re: linux-next: Tree for Aug 20 (mm/memcontrol)

2019-08-20 Thread Roman Gushchin
On Tue, Aug 20, 2019 at 07:59:54AM -0700, Randy Dunlap wrote:
> On 8/20/19 12:09 AM, Stephen Rothwell wrote:
> > Hi all,
> > 
> > Changes since 20190819:
> > 
> 
> on i386 or x86_64:
> 
> ../mm/memcontrol.c: In function ‘__mem_cgroup_free’:
> ../mm/memcontrol.c:4885:2: error: implicit declaration of function ‘memcg_flush_percpu_vmstats’; did you mean ‘qdisc_is_percpu_stats’? [-Werror=implicit-function-declaration]
>   memcg_flush_percpu_vmstats(memcg, false);
>   ^~
>   qdisc_is_percpu_stats
> ../mm/memcontrol.c:4886:2: error: implicit declaration of function ‘memcg_flush_percpu_vmevents’; did you mean ‘memcg_check_events’? [-Werror=implicit-function-declaration]
>   memcg_flush_percpu_vmevents(memcg);
>   ^~~
>   memcg_check_events
> 
> 
> 
> Full i386 randconfig file is attached.

Hi Randy!

The issue has already been fixed ( https://lkml.org/lkml/2019/8/19/1007 ),
and Andrew has picked an updated version to the mm tree.
So it will be resolved soon.

Thanks!


[PATCH] iommu/vt-d: Fix wrong analysis whether devices share the same bus

2019-08-20 Thread Nadav Amit
set_msi_sid_cb() is used to determine whether device aliases share the
same bus, but it can provide false indications that aliases use the same
bus when in fact they do not. The reason is that set_msi_sid_cb()
assumes that pdev is fixed, while actually pci_for_each_dma_alias() can
call fn() when pdev is set to a subordinate device.

As a result, running a VM on ESX with VT-d emulation enabled can
result in log warnings such as:

  DMAR: [INTR-REMAP] Request device [00:11.0] fault index 3b [fault reason 38] Blocked an interrupt request due to source-id verification failure

This seems to cause additional ATA errors such as:
  ata3.00: qc timeout (cmd 0xa1)
  ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)

These timeouts also make boot take much longer and cause other errors.

Fix it by comparing the alias with the previous one instead.

Fixes: 3f0c625c6ae71 ("iommu/vt-d: Allow interrupts from the entire bus for aliased devices")
Cc: sta...@vger.kernel.org
Cc: Logan Gunthorpe 
Cc: David Woodhouse 
Cc: Joerg Roedel 
Cc: Jacob Pan 
Signed-off-by: Nadav Amit 
---
 drivers/iommu/intel_irq_remapping.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 4786ca061e31..81e43c1df7ec 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -376,13 +376,13 @@ static int set_msi_sid_cb(struct pci_dev *pdev, u16 alias, void *opaque)
 {
struct set_msi_sid_data *data = opaque;
 
+   if (data->count == 0 || PCI_BUS_NUM(alias) == PCI_BUS_NUM(data->alias))
+   data->busmatch_count++;
+
data->pdev = pdev;
data->alias = alias;
data->count++;
 
-   if (PCI_BUS_NUM(alias) == pdev->bus->number)
-   data->busmatch_count++;
-
return 0;
 }
 
-- 
2.17.1



Re: [PATCH] arm64: perf_event: Add missing header needed for smp_processor_id()

2019-08-20 Thread Will Deacon
On Tue, Aug 20, 2019 at 05:06:29PM +0100, Mark Rutland wrote:
> On Tue, Aug 20, 2019 at 04:57:45PM +0100, Raphael Gault wrote:
> 
> It would be worth having a body for the commit message like:
> 
> | in perf_event.c we use smp_processor_id(), but we haven't included 
> |  where it is defined, and rely on this being pulled in 
> | via a transitive include. Let's make this more robust by including
> |  explciitly.
> 
> ... and with that, my Acked-by stands.

Queued for 5.4. with typo fixed above.

Will


Re: [PATCH] MAINTAINERS, x86/CPU: Tony Luck will maintain asm/intel-family.h

2019-08-20 Thread Peter Zijlstra
On Thu, Aug 15, 2019 at 10:22:07PM +0200, Thomas Gleixner wrote:

> We have the following existing _SHORT variants:
> 
> _G
> _EP
> _EX
> _CORE
> _ULT
> _GT3E
> _XEON_D
> _MOBILE
> _DESKTOP
> _NNPI
> _MID
> _TABLET
> _PLUS

Your list is missing: _L and _X.

_X is the generic 'server'

And we have only MEROM_L, which, afaict, is a mobile variant of MEROM.
Now, I just sent out patches doing s/_MOBILE/_ULT/, so I suppose this
then should be MEROM_ULT.

Or we go the other way and do: s/_ULT/_MOBILE/, in which case this
becomes: MEROM_MOBILE.

No strong feelings either way.


Re: [v5 PATCH 4/4] mm: thp: make deferred split shrinker memcg aware

2019-08-20 Thread Yang Shi




On 8/20/19 4:06 AM, Kirill Tkhai wrote:

On 07.08.2019 05:17, Yang Shi wrote:

Currently the THP deferred split shrinker is not memcg aware, which may cause
premature OOM with some configurations. For example, the test below would
run into premature OOM easily:

$ cgcreate -g memory:thp
$ echo 4G > /sys/fs/cgroup/memory/thp/memory.limit_in_bytes
$ cgexec -g memory:thp transhuge-stress 4000

transhuge-stress comes from kernel selftest.

It is easy to hit OOM while there are still a lot of THPs on the deferred
split queue; memcg direct reclaim can't touch them since the deferred
split shrinker is not memcg aware.

Make the deferred split shrinker memcg aware by introducing a per-memcg
deferred split queue.  The THP should be on either the per-node or the
per-memcg deferred split queue if it belongs to a memcg.  When the page is
migrated to another memcg, it will be moved to the target memcg's
deferred split queue too.

Reuse the second tail page's deferred_list for per memcg list since the
same THP can't be on multiple deferred split queues.

Cc: Kirill Tkhai 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: "Kirill A . Shutemov" 
Cc: Hugh Dickins 
Cc: Shakeel Butt 
Cc: David Rientjes 
Cc: Qian Cai 
Acked-by: Kirill A. Shutemov 
Signed-off-by: Yang Shi 

Reviewed-by: Kirill Tkhai 

But, please, see below one small suggestion.


---
  include/linux/huge_mm.h|  9 ++
  include/linux/memcontrol.h |  4 +++
  include/linux/mm_types.h   |  1 +
  mm/huge_memory.c   | 76 +++---
  mm/memcontrol.c| 24 +++
  5 files changed, 103 insertions(+), 11 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 45ede62..61c9ffd 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -267,6 +267,15 @@ static inline bool thp_migration_supported(void)
return IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION);
  }
  
+static inline struct list_head *page_deferred_list(struct page *page)

+{
+   /*
+* Global or memcg deferred list in the second tail pages is
+* occupied by compound_head.
+*/
+   return &page[2].deferred_list;
+}
+
  #else /* CONFIG_TRANSPARENT_HUGEPAGE */
  #define HPAGE_PMD_SHIFT ({ BUILD_BUG(); 0; })
  #define HPAGE_PMD_MASK ({ BUILD_BUG(); 0; })
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 5771816..cace365 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -312,6 +312,10 @@ struct mem_cgroup {
struct list_head event_list;
spinlock_t event_list_lock;
  
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE

+   struct deferred_split deferred_split_queue;
+#endif
+
struct mem_cgroup_per_node *nodeinfo[0];
/* WARNING: nodeinfo must be the last member here */
  };
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3a37a89..156640c 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -139,6 +139,7 @@ struct page {
struct {/* Second tail page of compound page */
unsigned long _compound_pad_1;  /* compound_head */
unsigned long _compound_pad_2;
+   /* For both global and memcg */
struct list_head deferred_list;
};
struct {/* Page table pages */
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e0d8e08..c9a596e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -495,11 +495,25 @@ pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
return pmd;
  }
  
-static inline struct list_head *page_deferred_list(struct page *page)

+#ifdef CONFIG_MEMCG
+static inline struct deferred_split *get_deferred_split_queue(struct page *page)
  {
-   /* ->lru in the tail pages is occupied by compound_head. */
-   return &page[2].deferred_list;
+   struct mem_cgroup *memcg = compound_head(page)->mem_cgroup;
+   struct pglist_data *pgdat = NODE_DATA(page_to_nid(page));
+
+   if (memcg)
+   return &memcg->deferred_split_queue;
+   else
+   return &pgdat->deferred_split_queue;
+}
+#else
+static inline struct deferred_split *get_deferred_split_queue(struct page *page)
+{
+   struct pglist_data *pgdat = NODE_DATA(page_to_nid(page));
+
+   return &pgdat->deferred_split_queue;
  }
+#endif
  
  void prep_transhuge_page(struct page *page)

  {
@@ -2658,7 +2672,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
  {
struct page *head = compound_head(page);
struct pglist_data *pgdata = NODE_DATA(page_to_nid(head));
-   struct deferred_split *ds_queue = &pgdata->deferred_split_queue;
+   struct deferred_split *ds_queue = get_deferred_split_queue(page);
struct anon_vma *anon_vma = NULL;
struct address_space *mapping = NULL;
int count, mapcount, extra_pins, ret;
@@ -2794,8 +2808,7 @@ int split_huge_page_to_list(struct page *page, struct 

Re: [Question-kvm] Can hva_to_pfn_fast be executed in interrupt context?

2019-08-20 Thread Bharath Vedartham
On Thu, Aug 15, 2019 at 08:26:43PM +0200, Paolo Bonzini wrote:
> Oh, I see. Sorry I didn't understand the question. In the case of KVM,
> there's simply no code that runs in interrupt context and needs to use
> virtual addresses.
> 
> In fact, there's no code that runs in interrupt context at all. The only
> code that deals with host interrupts in a virtualization host is in VFIO,
> but all it needs to do is signal an eventfd.
> 
> Paolo
Great, answers my question. Thank you for your time.

Thank you
Bharath
> 
> Il gio 15 ago 2019, 19:18 Bharath Vedartham  ha
> scritto:
> 
> > On Tue, Aug 13, 2019 at 10:17:09PM +0200, Paolo Bonzini wrote:
> > > On 13/08/19 21:14, Bharath Vedartham wrote:
> > > > Hi all,
> > > >
> > > > I was looking at the function hva_to_pfn_fast(in virt/kvm/kvm_main)
> > which is
> > > > executed in an atomic context(even in non-atomic context, since
> > > > hva_to_pfn_fast is much faster than hva_to_pfn_slow).
> > > >
> > > > My question is can this be executed in an interrupt context?
> > >
> > > No, it cannot for the reason you mention below.
> > >
> > > Paolo
> > hmm.. Well I expected the answer to be kvm specific.
> > Because I observed a similar use-case for a driver (sgi-gru) where
> > we want to retrive the physical address of a virtual address. This was
> > done in atomic and non-atomic context similar to hva_to_pfn_fast and
> > hva_to_pfn_slow. __get_user_pages_fast(for atomic case)
> > would not work as the driver could execute in interrupt context.
> >
> > The driver manually walked the page tables to handle this issue.
> >
> > Since kvm is a widely used piece of code, I asked this question to know
> > how kvm handled this issue.
> >
> > Thank you for your time.
> >
> > Thank you
> > Bharath
> > > > The motivation for this question is that in an interrupt context, we
> > cannot
> > > > assume "current" to be the task_struct of the process of interest.
> > > > __get_user_pages_fast assume current->mm when walking the process page
> > > > tables.
> > > >
> > > > So if this function hva_to_pfn_fast can be executed in an
> > > > interrupt context, it would not be safe to retrive the pfn with
> > > > __get_user_pages_fast.
> > > >
> > > > Thoughts on this?
> > > >
> > > > Thank you
> > > > Bharath
> > > >
> > >
> >


Re: [v5 PATCH 3/4] mm: shrinker: make shrinker not depend on memcg kmem

2019-08-20 Thread Yang Shi




On 8/20/19 4:01 AM, Kirill Tkhai wrote:

On 07.08.2019 05:17, Yang Shi wrote:

Currently the shrinker is only allocated and can only work when memcg kmem
is enabled.  But the THP deferred split shrinker is not a slab shrinker, so
it doesn't make much sense for it to depend on memcg kmem.
It should be able to reclaim THPs even when memcg kmem is disabled.

Introduce a new shrinker flag, SHRINKER_NONSLAB, for non-slab shrinkers.
When memcg kmem is disabled, only such shrinkers can be called when
shrinking memcg slab.

Cc: Kirill Tkhai 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: "Kirill A . Shutemov" 
Cc: Hugh Dickins 
Cc: Shakeel Butt 
Cc: David Rientjes 
Cc: Qian Cai 
Acked-by: Kirill A. Shutemov 
Signed-off-by: Yang Shi 

Looks OK to me, but I have some doubts about the naming.

SHRINKER_NONSLAB. There are a lot of shrinkers which are not
related to slab. For example, mmu_shrinker in arch/x86/kvm/mmu.c.
Intuitively, and without mm knowledge, I assume I would be surprised
that it's not marked as NONSLAB. Can we improve this in some way?


Actually, SHRINKER_NONSLAB just makes sense when the shrinker is also 
MEMCG_AWARE for now.


I didn't think of a better name, any suggestion? I could add some 
comment to explain this, non-MEMCG_AWARE shrinker should not care about 
setting this flag even though it is non-slab.


And, once this patch is in Linus's tree, I will double check if there is 
any MEMCG_AWARE non-slab shrinker although my quick search didn't show 
others except inode/dcache and workingset node.




The rest looks OK for me.

Reviewed-by: Kirill Tkhai 


---
  include/linux/memcontrol.h | 19 ---
  include/linux/shrinker.h   |  3 ++-
  mm/memcontrol.c|  9 +--
  mm/vmscan.c| 60 --
  4 files changed, 45 insertions(+), 46 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 44c4146..5771816 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -128,9 +128,8 @@ struct mem_cgroup_per_node {
  
  	struct mem_cgroup_reclaim_iter	iter[DEF_PRIORITY + 1];
  
-#ifdef CONFIG_MEMCG_KMEM
struct memcg_shrinker_map __rcu *shrinker_map;
-#endif
+
struct rb_node  tree_node;  /* RB tree node */
unsigned long   usage_in_excess;/* Set to the value by which */
/* the soft limit is exceeded*/
@@ -1253,6 +1252,11 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
} while ((memcg = parent_mem_cgroup(memcg)));
return false;
  }
+
+extern int memcg_expand_shrinker_maps(int new_id);
+
+extern void memcg_set_shrinker_bit(struct mem_cgroup *memcg,
+  int nid, int shrinker_id);
  #else
  #define mem_cgroup_sockets_enabled 0
  static inline void mem_cgroup_sk_alloc(struct sock *sk) { };
@@ -1261,6 +1265,11 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
  {
return false;
  }
+
+static inline void memcg_set_shrinker_bit(struct mem_cgroup *memcg,
+ int nid, int shrinker_id)
+{
+}
  #endif
  
  struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep);

@@ -1332,10 +1341,6 @@ static inline int memcg_cache_id(struct mem_cgroup *memcg)
return memcg ? memcg->kmemcg_id : -1;
  }
  
-extern int memcg_expand_shrinker_maps(int new_id);
-
-extern void memcg_set_shrinker_bit(struct mem_cgroup *memcg,
-  int nid, int shrinker_id);
  #else
  
  static inline int memcg_kmem_charge(struct page *page, gfp_t gfp, int order)

@@ -1377,8 +1382,6 @@ static inline void memcg_put_cache_ids(void)
  {
  }
  
-static inline void memcg_set_shrinker_bit(struct mem_cgroup *memcg,
- int nid, int shrinker_id) { }
  #endif /* CONFIG_MEMCG_KMEM */
  
  #endif /* _LINUX_MEMCONTROL_H */

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 9443caf..9e112d6 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -69,7 +69,7 @@ struct shrinker {
  
  	/* These are for internal use */

struct list_head list;
-#ifdef CONFIG_MEMCG_KMEM
+#ifdef CONFIG_MEMCG
/* ID in shrinker_idr */
int id;
  #endif
@@ -81,6 +81,7 @@ struct shrinker {
  /* Flags */
  #define SHRINKER_NUMA_AWARE   (1 << 0)
  #define SHRINKER_MEMCG_AWARE  (1 << 1)
+#define SHRINKER_NONSLAB   (1 << 2)
  
  extern int prealloc_shrinker(struct shrinker *shrinker);

  extern void register_shrinker_prepared(struct shrinker *shrinker);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cdbb7a8..d90ded1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -313,6 +313,7 @@ void memcg_put_cache_ids(void)
  EXPORT_SYMBOL(memcg_kmem_enabled_key);
  
  struct workqueue_struct *memcg_kmem_cache_wq;

+#endif
  
  static int memcg_shrinker_map_size;

  static 

Re: [PATCH v2 1/2] dt-bindings: serial: lantiq: Convert to YAML schema

2019-08-20 Thread Rob Herring
On Tue, Aug 20, 2019 at 3:29 AM Rahul Tanwar
 wrote:
>
> Convert the existing DT binding document for Lantiq SoC ASC serial controller
> from txt format to YAML format.
>
> Signed-off-by: Rahul Tanwar 
> ---
>  .../devicetree/bindings/serial/lantiq_asc.txt  | 31 --
>  .../devicetree/bindings/serial/lantiq_asc.yaml | 70 
> ++

Use the compatible name: lantiq,asc.yaml

Don't forget the $id value too.

>  2 files changed, 70 insertions(+), 31 deletions(-)
>  delete mode 100644 Documentation/devicetree/bindings/serial/lantiq_asc.txt
>  create mode 100644 Documentation/devicetree/bindings/serial/lantiq_asc.yaml


> diff --git a/Documentation/devicetree/bindings/serial/lantiq_asc.yaml 
> b/Documentation/devicetree/bindings/serial/lantiq_asc.yaml
> new file mode 100644
> index ..54b90490f4fb
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/serial/lantiq_asc.yaml
> @@ -0,0 +1,70 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/serial/lantiq_asc.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Lantiq SoC ASC serial controller
> +
> +maintainers:
> +  - Rahul Tanwar 
> +
> +allOf:
> +  - $ref: /schemas/serial.yaml#
> +
> +properties:
> +  compatible:
> +oneOf:
> +  items:
> +- const: lantiq,asc
> +
> +  reg:
> +maxItems: 1
> +
> +  interrupts:
> +minItems: 1

Technically, 1 item is not allowed until patch 2 (or the old doc was wrong).

> +maxItems: 3
> +items:
> +  - description: tx or combined interrupt
> +  - description: rx interrupt
> +  - description: err interrupt
> +
> +  clocks:
> +description:
> +  When present, first entry listed should contain phandle
> +  to the frequency clock and second entry should contain
> +  phandle to the gate clock.

Schema needs to define how many entries:

items:
  - description: ...
  - description: ...

> +
> +  clock-names:
> +items:
> +  - const: freq
> +  - const: asc
> +
> +required:
> +  - compatible
> +  - reg
> +  - interrupts
> +
> +
> +examples:
> +  - |
> +asc0: serial@1660 {
> +compatible = "lantiq,asc";
> +reg = <0x1660 0x10>;
> +interrupt-parent = <>;
> +interrupts = ,
> + ,
> + ;
> +clocks = < CLK_SSX4>, < GCLK_UART>;
> +clock-names = "freq", "asc";
> +};
> +
> +  - |
> +asc1: serial@e100c00 {

I don't think this 2nd example adds anything.

> +compatible = "lantiq,asc";
> +reg = <0xE100C00 0x400>;
> +interrupt-parent = <>;
> +interrupts = <112 113 114>;
> +};
> +
> +...
> --
> 2.11.0
>


Re: [PATCH v3] dt-bindings: arm: Convert Actions Semi bindings to jsonschema

2019-08-20 Thread Manivannan Sadhasivam
On Fri, Jun 14, 2019 at 01:33:47PM -0600, Rob Herring wrote:
> On Fri, Jun 14, 2019 at 11:07 AM Andreas Färber  wrote:
> >
> > Am 14.06.19 um 19:04 schrieb Manivannan Sadhasivam:
> > > On Thu, Jun 13, 2019 at 04:44:35PM -0600, Rob Herring wrote:
> > >> On Fri, May 17, 2019 at 10:32:23AM -0500, Rob Herring wrote:
> > >>> Convert Actions Semi SoC bindings to DT schema format using json-schema.
> > >>>
> > >>> Cc: "Andreas Färber" 
> > >>> Cc: Manivannan Sadhasivam 
> > >>> Cc: Mark Rutland 
> > >>> Cc: linux-arm-ker...@lists.infradead.org
> > >>> Cc: devicet...@vger.kernel.org
> > >>> Signed-off-by: Rob Herring 
> > >>> ---
> > >>> v3:
> > >>> - update MAINTAINERS
> > >>>
> > >>>  .../devicetree/bindings/arm/actions.txt   | 56 ---
> > >>>  .../devicetree/bindings/arm/actions.yaml  | 38 +
> > >>>  MAINTAINERS   |  2 +-
> > >>>  3 files changed, 39 insertions(+), 57 deletions(-)
> > >>>  delete mode 100644 Documentation/devicetree/bindings/arm/actions.txt
> > >>>  create mode 100644 Documentation/devicetree/bindings/arm/actions.yaml
> > >>
> > >> Ping. Please apply or modify this how you'd prefer. I'm not going to
> > >> keep respinning this.
> > >>
> > >
> > > Sorry for that Rob.
> >
> > Well, it was simply not clear whether we were supposed to or not. :)
> 
> I thought 'To' you and a single patch should be clear enough.
> 
> > > Andreas, are you going to take this patch? Else I'll pick it up (If you
> > > want me to do the PR for next cycle)
> >
> > I had checked that all previous changes to the .txt file were by myself,
> > so I would prefer if we not license it under GPLv2-only but under the
> > same dual-license (MIT/GPLv2+) as the DTs. That modification would need
> > Rob's approval then.
> 
> That's fine and dual license is preferred. Can you adjust that when
> applying. Note that the preference for schema is (GPL-2.0 OR
> BSD-2-Clause), but MIT/GPLv2+ is fine by me.

Andreas, are you going to take this patch? Else, we can ask Rob to take
this through his tree as we don't have any queued patches for v5.4 yet.

Thanks,
Mani

> 
> Rob


Re: [PATCH] arm64: perf_event: Add missing header needed for smp_processor_id()

2019-08-20 Thread Mark Rutland
On Tue, Aug 20, 2019 at 04:57:45PM +0100, Raphael Gault wrote:

It would be worth having a body for the commit message like:

| in perf_event.c we use smp_processor_id(), but we haven't included 
|  where it is defined, and rely on this being pulled in 
| via a transitive include. Let's make this more robust by including
|  explicitly.

... and with that, my Acked-by stands.

Thanks,
Mark.

> Acked-by: Mark Rutland 
> Signed-off-by: Raphael Gault 
> ---
>  arch/arm64/kernel/perf_event.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 96e90e270042..24575c0a0065 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  /* ARMv8 Cortex-A53 specific event types. */
>  #define ARMV8_A53_PERFCTR_PREF_LINEFILL  0xC2
> -- 
> 2.17.1
> 


Re: [PATCH] x86/mm/pti: in pti_clone_pgtable() don't increase addr by PUD_SIZE

2019-08-20 Thread Song Liu



> On Aug 20, 2019, at 7:18 AM, Dave Hansen  wrote:
> 
> On 8/20/19 7:14 AM, Song Liu wrote:
>>> *But*, that shouldn't get hit on a Skylake CPU since those have PCIDs
>>> and shouldn't have a global kernel image.  Could you confirm whether
>>> PCIDs are supported on this CPU?
>> Yes, pcid is listed in /proc/cpuinfo. 
> 
> So what's going on?  Could you confirm exactly which pti_clone_pgtable()
> is causing you problems?  Do you have a theory as to why this manifests
> as a performance problem rather than a functional one?
> 
> A diff of these:
> 
>   /sys/kernel/debug/page_tables/current_user
>   /sys/kernel/debug/page_tables/current_kernel
> 
> before and after your patch might be helpful.

I believe the difference is from the following entries (7 PMDs)

Before the patch:

current_kernel: 0x8100-0x81e04000   14352K ro   GLB x  pte
efi:            0x8100-0x81e04000   14352K ro   GLB x  pte
kernel:         0x8100-0x81e04000   14352K ro   GLB x  pte


After the patch:

current_kernel: 0x8100-0x81e0  14M ro   PSE GLB x  pmd
efi:            0x8100-0x81e0  14M ro   PSE GLB x  pmd
kernel:         0x8100-0x81e0  14M ro   PSE GLB x  pmd

current_kernel and kernel show same data though. 

Thanks,
Song



Re: [PATCH] userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx

2019-08-20 Thread Andrea Arcangeli
On Tue, Aug 20, 2019 at 06:02:38PM +0200, Oleg Nesterov wrote:
> userfaultfd_release() should clear vm_flags/vm_userfaultfd_ctx even
> if mm->core_state != NULL.
> 
> Otherwise a page fault can see userfaultfd_missing() == T and use an
> already freed userfaultfd_ctx.
> 
> Reported-by: Kefeng Wang 
> Fixes: 04f5866e41fb ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Oleg Nesterov 
> ---
>  fs/userfaultfd.c | 25 +
>  1 file changed, 13 insertions(+), 12 deletions(-)

Reviewed-by: Andrea Arcangeli 

Thanks,
Andrea


Re: [PATCH v3 2/5] arm64: cpufeature: Add feature to detect heterogeneous systems

2019-08-20 Thread Mark Rutland
On Tue, Aug 20, 2019 at 04:55:24PM +0100, Raphael Gault wrote:
> Hi Mark,
> 
> Thank you for your comments.
> 
> On 8/20/19 4:49 PM, Mark Rutland wrote:
> > On Tue, Aug 20, 2019 at 04:23:17PM +0100, Mark Rutland wrote:
> > > Hi Raphael,
> > > 
> > > On Fri, Aug 16, 2019 at 01:59:31PM +0100, Raphael Gault wrote:
> > > > This feature is required in order to enable PMU counters direct
> > > > access from userspace only when the system is homogeneous.
> > > > This feature checks the model of each CPU brought online and compares it
> > > > to the boot CPU. If it differs then it is heterogeneous.
> > > 
> > > It would be worth noting that this patch prevents heterogeneous CPUs
> > > being brought online late if the system was uniform at boot time.
> > 
> > Looking again, I think I'd misunderstood how
> > ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU was dealt with, but we do have a
> > problem in this area.
> > 
> > [...]
> > 
> > > 
> > > > +   .capability = ARM64_HAS_HETEROGENEOUS_PMU,
> > > > +   .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU,
> > > > +   .matches = has_heterogeneous_pmu,
> > > > +   },
> > 
> > I had a quick chat with Will, and we concluded that we must permit late
> > onlining of heterogeneous CPUs here as people are likely to rely on
> > late CPU onlining on some heterogeneous systems.
> > 
> > I think the above permits that, but that also means that we need some
> > support code to fail gracefully in that case (e.g. without sending
> > a SIGILL to unaware userspace code).
> 
> I understand, however, I understood that ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU
> did not allow later CPU to be heterogeneous if the capability wasn't already
> enabled.

Yes, I think that you're right. IIUC the absence of
ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU is what prevents that from
happening.

> Thus if as you say we need to allow the system to switch from
> homogeneous to heterogeneous, then I should change the type of this
> capability.

I'm afraid so!

I believe we need both ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU and
ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU, so I guess we should be using
ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE.

Does that sound right to you? ... or have I confused myself again?
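
To spell out what I mean by needing both flags, here is a rough userspace
model of the late-CPU check. It is only a sketch under stated assumptions:
the CAP_* names and bit values are made up for illustration (the real
definitions live in the arm64 cpufeature header), and late_cpu_ok() is a
hypothetical helper rather than anything in the tree. I'm assuming the weak
local type is local scope plus both late-CPU flags.

```c
#include <stdbool.h>

/* Illustrative bit values only; not copied from the kernel headers. */
#define CAP_SCOPE_LOCAL_CPU		(1u << 0)
#define CAP_PERMITTED_FOR_LATE_CPU	(1u << 4)
#define CAP_OPTIONAL_FOR_LATE_CPU	(1u << 5)

/* Assumed composition of the "weak local CPU feature" type. */
#define CAP_WEAK_LOCAL_CPU_FEATURE	\
	(CAP_SCOPE_LOCAL_CPU |		\
	 CAP_OPTIONAL_FOR_LATE_CPU |	\
	 CAP_PERMITTED_FOR_LATE_CPU)

/*
 * Hypothetical model: may a CPU that comes online late disagree with
 * the state the system settled on at boot?
 */
bool late_cpu_ok(unsigned int type, bool system_has, bool cpu_has)
{
	/* System has the cap, late CPU lacks it: needs OPTIONAL. */
	if (system_has && !cpu_has)
		return (type & CAP_OPTIONAL_FOR_LATE_CPU) != 0;

	/* System lacks the cap, late CPU has it: needs PERMITTED. */
	if (!system_has && cpu_has)
		return (type & CAP_PERMITTED_FOR_LATE_CPU) != 0;

	/* No disagreement, nothing to check. */
	return true;
}
```

With the type from the patch (local scope plus OPTIONAL only), a
heterogeneous CPU arriving after a uniform boot is the "system lacks, CPU
has" case and is rejected in this model; the weak local type accepts it.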

Thanks,
Mark.

> > That means that we'll need the counter emulation code that you had in
> > previous versions of this patch (e.g. to handle potential UNDEFs when a
> > new CPU has fewer counters than the previously online CPUs).
> > 
> > Further, I think the context switch (and event index) code needs to take
> > this cap into account, and disable direct access once the system becomes
> > heterogeneous.
> 
> That is a good point indeed.
> 
> Thanks,
> 
> -- 
> Raphael Gault


[PATCH] userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx

2019-08-20 Thread Oleg Nesterov
userfaultfd_release() should clear vm_flags/vm_userfaultfd_ctx even
if mm->core_state != NULL.

Otherwise a page fault can see userfaultfd_missing() == T and use an
already freed userfaultfd_ctx.

Reported-by: Kefeng Wang 
Fixes: 04f5866e41fb ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping")
Cc: sta...@vger.kernel.org
Signed-off-by: Oleg Nesterov 
---
 fs/userfaultfd.c | 25 +
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index ccbdbd6..fe6d804 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -880,6 +880,7 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
/* len == 0 means wake all */
struct userfaultfd_wake_range range = { .len = 0, };
unsigned long new_flags;
+   bool still_valid;
 
WRITE_ONCE(ctx->released, true);
 
@@ -895,8 +896,7 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
 * taking the mmap_sem for writing.
 */
down_write(>mmap_sem);
-   if (!mmget_still_valid(mm))
-   goto skip_mm;
+   still_valid = mmget_still_valid(mm);
prev = NULL;
for (vma = mm->mmap; vma; vma = vma->vm_next) {
cond_resched();
@@ -907,19 +907,20 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
continue;
}
new_flags = vma->vm_flags & ~(VM_UFFD_MISSING | VM_UFFD_WP);
-   prev = vma_merge(mm, prev, vma->vm_start, vma->vm_end,
-new_flags, vma->anon_vma,
-vma->vm_file, vma->vm_pgoff,
-vma_policy(vma),
-NULL_VM_UFFD_CTX);
-   if (prev)
-   vma = prev;
-   else
-   prev = vma;
+   if (still_valid) {
+   prev = vma_merge(mm, prev, vma->vm_start, vma->vm_end,
+new_flags, vma->anon_vma,
+vma->vm_file, vma->vm_pgoff,
+vma_policy(vma),
+NULL_VM_UFFD_CTX);
+   if (prev)
+   vma = prev;
+   else
+   prev = vma;
+   }
vma->vm_flags = new_flags;
vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
}
-skip_mm:
up_write(>mmap_sem);
mmput(mm);
 wakeup:
-- 
2.5.0




Re: [PATCH] sched/core: Schedule new worker even if PI-blocked

2019-08-20 Thread Peter Zijlstra
On Tue, Aug 20, 2019 at 05:54:01PM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-08-20 17:20:25 [+0200], Peter Zijlstra wrote:

> > And am I right in thinking that that, again, is specific to the
> > sleeping-spinlocks from PREEMPT_RT? Is there really nothing else that
> > identifies those more specifically? It's been a while since I looked at
> > them.
> 
> Not really. I hacked "int sleeping_lock" into task_struct which is
> incremented each time a "sleeping lock" version of rtmutex is requested.
> We have two users as of now:
> - RCU, which checks if we schedule() while holding rcu_read_lock() which
>   is okay if it is a sleeping lock.
> 
> - NOHZ's pending softirq detection while going to idle. It is possible
>   that "ksoftirqd" and "current" are blocked on locks and the CPU goes
>   to idle (because nothing else is runnable) with pending softirqs.
> 
> I wanted to let rtmutex invoke another schedule() function in case of a
> sleeping lock to avoid the RCU warning. This would avoid incrementing
> "sleeping_lock" in the fast path. But then I had no idea what to do with
> the NOHZ thing.

Once upon a time there was also a shadow task->state thing, that was
specific to the sleeping locks, because normally spinlocks don't muck
with task->state and so we have code relying on it not getting trampled.

Can't we use that somehow? Or is that gone?

> > Also, I suppose it would be really good to put that in a comment.
> So, what does that mean for that patch? According to my inbox it has
> been applied to an "urgent" branch. Do I resubmit the whole thing or just
> a comment on top?

Yeah, I'm not sure. I was surprised by that, because afaict all this is
PREEMPT_RT specific and not really /urgent material in the first place.
Ingo?


Re: [PATCH v2 2/2] dt-bindings: lantiq: Update for new SoC

2019-08-20 Thread Rob Herring
On Tue, Aug 20, 2019 at 3:29 AM Rahul Tanwar
 wrote:
>
> Intel Lightning Mountain(LGM) SoC reuses Lantiq ASC serial controller IP.
> Update the dt bindings to support LGM as well.
>
> Signed-off-by: Rahul Tanwar 
> ---
>  .../devicetree/bindings/serial/lantiq_asc.yaml  | 17 
> +
>  1 file changed, 17 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/serial/lantiq_asc.yaml 
> b/Documentation/devicetree/bindings/serial/lantiq_asc.yaml
> index 54b90490f4fb..92807b59b024 100644
> --- a/Documentation/devicetree/bindings/serial/lantiq_asc.yaml
> +++ b/Documentation/devicetree/bindings/serial/lantiq_asc.yaml
> @@ -17,6 +17,7 @@ properties:
>  oneOf:
>items:
>  - const: lantiq,asc
> +- const: intel,lgm-asc

Better expressed as:

compatible:
  enum:
- intel,lgm-asc
- lantiq,asc

>
>reg:
>  maxItems: 1
> @@ -28,6 +29,12 @@ properties:
>- description: tx or combined interrupt
>- description: rx interrupt
>- description: err interrupt
> +description:
> +  For lantiq,asc compatible, it supports 3 separate
> +  interrupts for tx rx & err. Whereas, for intel,lgm-asc
> +  compatible, it supports combined single interrupt for
> +  all of tx, rx & err interrupts.

This can be expressed with an if/then schema. There are some examples in
the tree of how to do that.

> +
>
>clocks:
>  description:
> @@ -67,4 +74,14 @@ examples:
>  interrupts = <112 113 114>;
>  };
>
> +  - |
> +asc0: serial@e0a0 {
> +compatible = "intel,lgm-asc";
> +reg = <0xe0a0 0x1000>;
> +interrupt-parent = <>;
> +interrupts = <128 1>;
> +clocks = < LGM_CLK_NOC4>, < LGM_GCLK_ASC0>;
> +clock-names = "freq", "asc";
> +};
> +
>  ...
> --
> 2.11.0
>


Re: [PATCH v5] ata/pata_buddha: Probe via modalias instead of initcall

2019-08-20 Thread Max Staudt
Hi Bartlomiej,

Thank you very much for your review!

Question below.


On 08/20/2019 02:06 PM, Bartlomiej Zolnierkiewicz wrote:
>> +/* Workaround for X-Surf: Save drvdata in case zorro8390 has set it */
>> +old_drvdata = dev_get_drvdata(>dev);
> 
> This should be done only for type == BOARD_XSURF.

Agreed, as I want to keep unloading functional for Buddha/Catweasel - see below.


>> +static struct zorro_driver pata_buddha_driver = {
>> +.name   = "pata_buddha",
>> +.id_table   = pata_buddha_zorro_tbl,
>> +.probe  = pata_buddha_probe,
>> +.remove = pata_buddha_remove,
> 
> I think that we should also add:
> 
>   .driver  = {
>   .suppress_bind_attrs = true,
>   },
> 
> to prevent the device from being unbound (and thus ->remove called)
> from the driver using sysfs interface.

Interesting idea - here's my question now:

My intention is to allow remove() for boards where we support IDE only (Buddha, 
Catweasel) - these are autoprobed via zorro_register_driver().
This shouldn't affect the X-Surf case, as it's not autoprobed in this way 
anyway - and thus pata_buddha_driver isn't even used.

Am I missing something? We want to inhibit module unloading (hence no 
module_exit()), but driver unbinding for Buddha/Catweasel should be fine to 
remain, right?


> Please also always check your patches with scripts/checkpatch.pl and
> fix the reported issues:

Apologies, must've been something in my coffee. I will.


Thanks for the review, I'll send a new patch once my question above is resolved.

Max


Re: [PATCH net] ixgbe: fix double clean of tx descriptors with xdp

2019-08-20 Thread Ilya Maximets
On 20.08.2019 18:35, Alexander Duyck wrote:
> On Tue, Aug 20, 2019 at 8:18 AM Ilya Maximets  wrote:
>>
>> Tx code doesn't clear the descriptor status after cleaning.
>> So, if the budget is larger than number of used elems in a ring, some
>> descriptors will be accounted twice and xsk_umem_complete_tx will move
>> prod_tail far beyond the prod_head breaking the completion queue ring.
>>
>> Fix that by limiting the number of descriptors to clean by the number
>> of used descriptors in the tx ring.
>>
>> Fixes: 8221c5eba8c1 ("ixgbe: add AF_XDP zero-copy Tx support")
>> Signed-off-by: Ilya Maximets 
> 
> I'm not sure this is the best way to go. My preference would be to
> have something in the ring that would prevent us from racing which I
> don't think this really addresses. I am pretty sure this code is safe
> on x86 but I would be worried about weakly ordered systems such as
> PowerPC.
> 
> It might make sense to look at adding the eop_desc logic like we have
> in the regular path with a proper barrier before we write it and after
> we read it. So for example we could hold off on writing the bytecount
> value until the end of an iteration and call smp_wmb before we write
> it. Then on the cleanup we could read it and if it is non-zero we take
> an smp_rmb before proceeding further to process the Tx descriptor and
> clearing the value. Otherwise this code is going to just keep popping
> up with issues.

But, unlike the regular case, XDP zero-copy xmit and clean for a particular
Tx ring always happen in the same NAPI context and even on the same
CPU core.

I saw the 'eop_desc' manipulations in the regular case and yes, we could
use the 'next_to_watch' field just as a flag of descriptor existence,
but it seems unnecessarily complicated. Am I missing something?
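
For reference, the idea of the fix quoted below can be modelled in a few
lines of plain C. This is only a sketch, not the driver code: struct
ring_model and ring_clean() are hypothetical stand-ins, and the model
leaves out the DD status check that the real loop still performs. Only the
used-descriptor formula is the one from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in holding only the fields the calculation needs. */
struct ring_model {
	uint16_t count;		/* total descriptors in the ring */
	uint16_t next_to_clean;	/* head */
	uint16_t next_to_use;	/* tail */
};

/* Same arithmetic as the ixgbe_desc_used() helper in the patch. */
unsigned int ring_desc_used(const struct ring_model *ring)
{
	unsigned int head = ring->next_to_clean;
	unsigned int tail = ring->next_to_use;

	return ((head <= tail) ? tail : tail + ring->count) - head;
}

/*
 * Clean at most min(budget, used) descriptors and return how many were
 * cleaned, so a large budget can never walk past next_to_use.
 */
unsigned int ring_clean(struct ring_model *ring, unsigned int budget)
{
	unsigned int used = ring_desc_used(ring);
	unsigned int cleaned = 0;

	while (budget && used) {
		ring->next_to_clean = (ring->next_to_clean + 1) % ring->count;
		cleaned++;
		budget--;
		used--;
	}

	return cleaned;
}
```

With an 8-entry ring, head at 6 and tail at 2, four descriptors are in use,
so a budget of 64 cleans exactly four and a second call cleans nothing.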

> 
>> ---
>>
>> Not tested yet because of lack of available hardware.
>> So, testing is very welcome.
>>
>>  drivers/net/ethernet/intel/ixgbe/ixgbe.h  | 10 ++
>>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 12 +---
>>  drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c  |  6 --
>>  3 files changed, 15 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h 
>> b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> index 39e73ad60352..0befcef46e80 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
>> @@ -512,6 +512,16 @@ static inline u16 ixgbe_desc_unused(struct ixgbe_ring 
>> *ring)
>> return ((ntc > ntu) ? 0 : ring->count) + ntc - ntu - 1;
>>  }
>>
>> +static inline u64 ixgbe_desc_used(struct ixgbe_ring *ring)
>> +{
>> +   unsigned int head, tail;
>> +
>> +   head = ring->next_to_clean;
>> +   tail = ring->next_to_use;
>> +
>> +   return ((head <= tail) ? tail : tail + ring->count) - head;
>> +}
>> +
>>  #define IXGBE_RX_DESC(R, i)\
>> (&(((union ixgbe_adv_rx_desc *)((R)->desc))[i]))
>>  #define IXGBE_TX_DESC(R, i)\
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c 
>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> index 7882148abb43..d417237857d8 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
>> @@ -1012,21 +1012,11 @@ static u64 ixgbe_get_tx_completed(struct ixgbe_ring 
>> *ring)
>> return ring->stats.packets;
>>  }
>>
>> -static u64 ixgbe_get_tx_pending(struct ixgbe_ring *ring)
>> -{
>> -   unsigned int head, tail;
>> -
>> -   head = ring->next_to_clean;
>> -   tail = ring->next_to_use;
>> -
>> -   return ((head <= tail) ? tail : tail + ring->count) - head;
>> -}
>> -
>>  static inline bool ixgbe_check_tx_hang(struct ixgbe_ring *tx_ring)
>>  {
>> u32 tx_done = ixgbe_get_tx_completed(tx_ring);
>> u32 tx_done_old = tx_ring->tx_stats.tx_done_old;
>> -   u32 tx_pending = ixgbe_get_tx_pending(tx_ring);
>> +   u32 tx_pending = ixgbe_desc_used(tx_ring);
>>
>> clear_check_for_tx_hang(tx_ring);
>>
>> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c 
>> b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
>> index 6b609553329f..7702efed356a 100644
>> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
>> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
>> @@ -637,6 +637,7 @@ bool ixgbe_clean_xdp_tx_irq(struct ixgbe_q_vector 
>> *q_vector,
>> u32 i = tx_ring->next_to_clean, xsk_frames = 0;
>> unsigned int budget = q_vector->tx.work_limit;
>> struct xdp_umem *umem = tx_ring->xsk_umem;
>> +   u32 used_descs = ixgbe_desc_used(tx_ring);
>> union ixgbe_adv_tx_desc *tx_desc;
>> struct ixgbe_tx_buffer *tx_bi;
>> bool xmit_done;
>> @@ -645,7 +646,7 @@ bool ixgbe_clean_xdp_tx_irq(struct ixgbe_q_vector 
>> *q_vector,
>> tx_desc = IXGBE_TX_DESC(tx_ring, i);
>> i -= tx_ring->count;
>>
>> -   do {
>> +   while (likely(budget && used_descs)) {
>> if (!(tx_desc->wb.status & cpu_to_le32(IXGBE_TXD_STAT_DD)))

Re: [PATCH 2/2] reset: meson-audio-arb: add sm1 support

2019-08-20 Thread Jerome Brunet
On Tue 20 Aug 2019 at 17:39, Philipp Zabel  wrote:

> Hi Jerome,
>
> thank you for the patch. Just one nitpick and one real issue below:
>
> On Tue, 2019-08-20 at 11:46 +0200, Jerome Brunet wrote:
>> Add the new arb reset lines of the SM1 SoC family
>> 
>> Signed-off-by: Jerome Brunet 
>> ---
>>  drivers/reset/reset-meson-audio-arb.c | 28 ---
>>  1 file changed, 25 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/reset/reset-meson-audio-arb.c 
>> b/drivers/reset/reset-meson-audio-arb.c
>> index c53a2185a039..72d29dbca45a 100644
>> --- a/drivers/reset/reset-meson-audio-arb.c
>> +++ b/drivers/reset/reset-meson-audio-arb.c
>> @@ -30,6 +30,17 @@ static const unsigned int axg_audio_arb_reset_bits[] = {
>>  [AXG_ARB_FRDDR_C]   = 6,
>>  };
>>  
>> +static const unsigned int sm1_audio_arb_reset_bits[] = {
>> +[AXG_ARB_TODDR_A]   = 0,
>> +[AXG_ARB_TODDR_B]   = 1,
>> +[AXG_ARB_TODDR_C]   = 2,
>> +[AXG_ARB_FRDDR_A]   = 4,
>> +[AXG_ARB_FRDDR_B]   = 5,
>> +[AXG_ARB_FRDDR_C]   = 6,
>> +[AXG_ARB_TODDR_D]   = 3,
>> +[AXG_ARB_FRDDR_D]   = 7,
>> +};
>> +
>>  static int meson_audio_arb_update(struct reset_controller_dev *rcdev,
>>unsigned long id, bool assert)
>>  {
>> @@ -82,8 +93,14 @@ static const struct reset_control_ops 
>> meson_audio_arb_rstc_ops = {
>>  };
>>  
>>  static const struct of_device_id meson_audio_arb_of_match[] = {
>> -{ .compatible = "amlogic,meson-axg-audio-arb", },
>> -{}
>> +{
>> +.compatible = "amlogic,meson-axg-audio-arb",
>> +.data = axg_audio_arb_reset_bits,
>> +},
>> +{
>> +.compatible = "amlogic,meson-sm1-audio-arb",
>> +.data = sm1_audio_arb_reset_bits
>> +}, {}
>
> Only slight preference, I would keep the sentinel on a separate line.
> Your choice.

Agreed.

>
>>  };
>>  MODULE_DEVICE_TABLE(of, meson_audio_arb_of_match);
>>  
>> @@ -104,10 +121,15 @@ static int meson_audio_arb_remove(struct 
>> platform_device *pdev)
>>  static int meson_audio_arb_probe(struct platform_device *pdev)
>>  {
>>  struct device *dev = >dev;
>> +const unsigned int *data;
>>  struct meson_audio_arb_data *arb;
>>  struct resource *res;
>>  int ret;
>>  
>> +data = of_device_get_match_data(dev);
>> +if (!data)
>> +return -EINVAL;
>> +
>>  arb = devm_kzalloc(dev, sizeof(*arb), GFP_KERNEL);
>>  if (!arb)
>>  return -ENOMEM;
>> @@ -126,7 +148,7 @@ static int meson_audio_arb_probe(struct platform_device 
>> *pdev)
>>  return PTR_ERR(arb->regs);
>>  
>>  spin_lock_init(>lock);
>> -arb->reset_bits = axg_audio_arb_reset_bits;
>> +arb->reset_bits = data;
>>  arb->rstc.nr_resets = ARRAY_SIZE(axg_audio_arb_reset_bits);
>
> Since SM1 has two more resets, this needs to come from device match data
> as well, or the last two resets will be unusable.

Absolutely. Sorry about that.
We are still a bit early in the process of adding support for this SoC.

I'll wait until I can do more complete tests before sending a v2.
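
Roughly, I have something like the following in mind for v2. Take it as an
untested userspace sketch, not the final driver code: struct
arb_match_data, arb_get_match_data() and the ARB_* index ordering are
placeholders for illustration, and the AXG bit values other than FRDDR_C
are assumed here rather than taken from the hunk above.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Reset line indices; this ordering is an assumption for the sketch. */
enum {
	ARB_TODDR_A, ARB_TODDR_B, ARB_TODDR_C,
	ARB_FRDDR_A, ARB_FRDDR_B, ARB_FRDDR_C,
	ARB_TODDR_D, ARB_FRDDR_D,
};

static const unsigned int axg_reset_bits[] = {
	[ARB_TODDR_A] = 0, [ARB_TODDR_B] = 1, [ARB_TODDR_C] = 2,
	[ARB_FRDDR_A] = 4, [ARB_FRDDR_B] = 5, [ARB_FRDDR_C] = 6,
};

static const unsigned int sm1_reset_bits[] = {
	[ARB_TODDR_A] = 0, [ARB_TODDR_B] = 1, [ARB_TODDR_C] = 2,
	[ARB_FRDDR_A] = 4, [ARB_FRDDR_B] = 5, [ARB_FRDDR_C] = 6,
	[ARB_TODDR_D] = 3, [ARB_FRDDR_D] = 7,
};

/* Carry the table and its size together, so nr_resets follows the SoC. */
struct arb_match_data {
	const unsigned int *reset_bits;
	unsigned int reset_num;
};

struct arb_match {
	const char *compatible;
	struct arb_match_data data;
};

static const struct arb_match arb_matches[] = {
	{ "amlogic,meson-axg-audio-arb",
	  { axg_reset_bits, ARRAY_SIZE(axg_reset_bits) } },
	{ "amlogic,meson-sm1-audio-arb",
	  { sm1_reset_bits, ARRAY_SIZE(sm1_reset_bits) } },
};

/* Stand-in for the device match data lookup done at probe time. */
const struct arb_match_data *arb_get_match_data(const char *compatible)
{
	for (size_t i = 0; i < ARRAY_SIZE(arb_matches); i++)
		if (strcmp(arb_matches[i].compatible, compatible) == 0)
			return &arb_matches[i].data;

	return NULL;
}
```

Probe would then set both reset_bits and nr_resets from the matched entry,
which gives 6 resets on AXG and 8 on SM1 in this model.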

>
>>  arb->rstc.ops = _audio_arb_rstc_ops;
>>  arb->rstc.of_node = dev->of_node;
>
> regards
> Philipp


Re: [PATCH 3/6] net: stmmac: sun8i: Use devm_regulator_get for PHY regulator

2019-08-20 Thread Andrew Lunn
On Tue, Aug 20, 2019 at 05:47:14PM +0200, Ondřej Jirman wrote:
> Hi Andrew,
> 
> On Tue, Aug 20, 2019 at 05:39:39PM +0200, Andrew Lunn wrote:
> > On Tue, Aug 20, 2019 at 04:53:40PM +0200, meg...@megous.com wrote:
> > > From: Ondrej Jirman 
> > > 
> > > Use devm_regulator_get instead of devm_regulator_get_optional and rely
> > > on dummy supply. This avoids NULL checks before regulator_enable/disable
> > > calls.
> > 
> > Hi Ondrej
> > 
> > What do you mean by a dummy supply? I'm just trying to make sure you
> > are not breaking backwards compatibility.
> 
> Sorry, I mean dummy regulator. See:
> 
> https://elixir.bootlin.com/linux/latest/source/drivers/regulator/core.c#L1874
> 
> On systems that use DT (i.e. have_full_constraints() == true), when the
> regulator is not found (ENODEV, not specified in DT), regulator_get will
> return a fake dummy regulator that can be enabled/disabled, but doesn't do
> anything real.

Hi Ondrej

But we also gain a new warning:

dev_warn(dev,
 "%s supply %s not found, using dummy regulator\n",
 devname, id);

This regulator is clearly optional, so there should not be a warning.

Maybe you can add a new get_type, OPTIONAL_GET, which does not issue
the warning, but does give back a dummy regulator.

Thanks
Andrew


[PATCH] arm64: perf_event: Add missing header needed for smp_processor_id()

2019-08-20 Thread Raphael Gault
Acked-by: Mark Rutland 
Signed-off-by: Raphael Gault 
---
 arch/arm64/kernel/perf_event.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 96e90e270042..24575c0a0065 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREF_LINEFILL0xC2
-- 
2.17.1



Re: [PATCH 3/6] net: stmmac: sun8i: Use devm_regulator_get for PHY regulator

2019-08-20 Thread Ondřej Jirman
On Tue, Aug 20, 2019 at 05:39:39PM +0200, Andrew Lunn wrote:
> On Tue, Aug 20, 2019 at 04:53:40PM +0200, meg...@megous.com wrote:
> > From: Ondrej Jirman 
> > 
> > Use devm_regulator_get instead of devm_regulator_get_optional and rely
> > on dummy supply. This avoids NULL checks before regulator_enable/disable
> > calls.
> 
> Hi Ondrej
> 
> What do you mean by a dummy supply? I'm just trying to make sure you
> are not breaking backwards compatibility.

I have tested it on Orange Pi PC 2, that uses only phy-supply, but not
phy-io-supply, and the kernel now prints:

[1.410137] dwmac-sun8i 1c3.ethernet: 1c3.ethernet supply phy-io not found, using dummy regulator

I have also tested it on Orange Pi PC, that doesn't use external phy, and
instead of:

[1.081378] dwmac-sun8i 1c3.ethernet: No regulator found

The kernel now prints:

[1.112752] dwmac-sun8i 1c3.ethernet: 1c3.ethernet supply phy not found, using dummy regulator
[1.112814] dwmac-sun8i 1c3.ethernet: 1c3.ethernet supply phy-io not found, using dummy regulator

Ethernet works in both cases, so that should cover all existing combinations. :)

regards,
Ondrej


>  Thanks
>   Andrew


Re: [PATCH v3 2/5] arm64: cpufeature: Add feature to detect heterogeneous systems

2019-08-20 Thread Raphael Gault

Hi Mark,

Thank you for your comments.

On 8/20/19 4:49 PM, Mark Rutland wrote:
> On Tue, Aug 20, 2019 at 04:23:17PM +0100, Mark Rutland wrote:
> > Hi Raphael,
> >
> > On Fri, Aug 16, 2019 at 01:59:31PM +0100, Raphael Gault wrote:
> > > This feature is required in order to enable PMU counters direct
> > > access from userspace only when the system is homogeneous.
> > > This feature checks the model of each CPU brought online and compares it
> > > to the boot CPU. If it differs then it is heterogeneous.
> >
> > It would be worth noting that this patch prevents heterogeneous CPUs
> > being brought online late if the system was uniform at boot time.
>
> Looking again, I think I'd misunderstood how
> ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU was dealt with, but we do have a
> problem in this area.
>
> [...]
>
> > > +   .capability = ARM64_HAS_HETEROGENEOUS_PMU,
> > > +   .type = ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU,
> > > +   .matches = has_heterogeneous_pmu,
> > > +   },
>
> I had a quick chat with Will, and we concluded that we must permit late
> onlining of heterogeneous CPUs here as people are likely to rely on
> late CPU onlining on some heterogeneous systems.
>
> I think the above permits that, but that also means that we need some
> support code to fail gracefully in that case (e.g. without sending
> a SIGILL to unaware userspace code).

I understand. However, my understanding was that
ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU did not allow a later CPU to be
heterogeneous if the capability wasn't already enabled. So if, as you
say, we need to allow the system to switch from homogeneous to
heterogeneous, then I should change the type of this capability.

> That means that we'll need the counter emulation code that you had in
> previous versions of this patch (e.g. to handle potential UNDEFs when a
> new CPU has fewer counters than the previously online CPUs).
>
> Further, I think the context switch (and event index) code needs to take
> this cap into account, and disable direct access once the system becomes
> heterogeneous.


That is a good point indeed.

Thanks,

--
Raphael Gault
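
The check described in the quoted commit message (compare each CPU's model with the boot CPU's) can be sketched in userspace as below. This is not the patch's has_heterogeneous_pmu(); the function names are hypothetical and the sketch assumes that "model" means the implementer, architecture and part-number fields of MIDR_EL1, ignoring variant and revision. The MIDR values in the usage note are illustrative samples.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MIDR_EL1 fields assumed to identify a CPU model. */
#define SKETCH_MIDR_IMPLEMENTER_MASK	0xff000000u	/* bits [31:24] */
#define SKETCH_MIDR_ARCHITECTURE_MASK	0x000f0000u	/* bits [19:16] */
#define SKETCH_MIDR_PARTNUM_MASK	0x0000fff0u	/* bits [15:4]  */
#define SKETCH_MIDR_MODEL_MASK		(SKETCH_MIDR_IMPLEMENTER_MASK | \
					 SKETCH_MIDR_ARCHITECTURE_MASK | \
					 SKETCH_MIDR_PARTNUM_MASK)

/* Strip variant and revision so that only the model remains. */
static uint32_t sketch_midr_model(uint32_t midr)
{
	return midr & SKETCH_MIDR_MODEL_MASK;
}

/*
 * midr[0] is taken to be the boot CPU. The system counts as
 * heterogeneous as soon as one other CPU reports a different model.
 */
static bool sketch_is_heterogeneous(const uint32_t *midr, size_t ncpus)
{
	for (size_t cpu = 1; cpu < ncpus; cpu++)
		if (sketch_midr_model(midr[cpu]) != sketch_midr_model(midr[0]))
			return true;
	return false;
}
```

For instance, two CPUs whose MIDR values differ only in the revision nibble compare equal, while a part-number difference (such as 0xd03 versus 0xd08) marks the system as heterogeneous. A late-onlined CPU would simply be one more entry checked against midr[0].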


Re: [PATCH v2 1/6] dt-bindings: watchdog: Add YAML schemas for the generic watchdog bindings

2019-08-20 Thread Guenter Roeck

On 8/19/19 11:20 AM, Maxime Ripard wrote:

From: Maxime Ripard 

The watchdogs have a bunch of generic properties that are needed in a
device tree. Add YAML schemas for those.

Signed-off-by: Maxime Ripard 


What is the target subsystem for this series? You didn't copy the watchdog
mailing list, so I assume it won't be the watchdog subsystem.

Thanks,
Guenter



---

Changes from v1:
   - New patch
---
  .../bindings/watchdog/watchdog.yaml   | 26 +++
  1 file changed, 26 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/watchdog/watchdog.yaml

diff --git a/Documentation/devicetree/bindings/watchdog/watchdog.yaml b/Documentation/devicetree/bindings/watchdog/watchdog.yaml
new file mode 100644
index ..187bf6cb62bf
--- /dev/null
+++ b/Documentation/devicetree/bindings/watchdog/watchdog.yaml
@@ -0,0 +1,26 @@
+# SPDX-License-Identifier: GPL-2.0
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/watchdog/watchdog.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Watchdog Generic Bindings
+
+maintainers:
+  - Guenter Roeck 
+  - Wim Van Sebroeck 
+
+description: |
+  This document describes generic bindings which can be used to
+  describe watchdog devices in a device tree.
+
+properties:
+  $nodename:
+    pattern: "^watchdog(@.*|-[0-9a-f])?$"
+
+  timeout-sec:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    description:
+      Contains the watchdog timeout in seconds.
+
+...
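
To see which node names the $nodename pattern in the schema above accepts, here is a small sketch using POSIX extended regular expressions. It assumes the pattern behaves the same under POSIX ERE as under the regex dialect the schema tooling uses, which holds for this simple expression; the helper name is hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the node name matches the schema's $nodename pattern. */
static bool nodename_is_watchdog(const char *name)
{
	regex_t re;
	bool ok;

	/* Same expression as in the schema: optional unit address or one hex digit suffix. */
	if (regcomp(&re, "^watchdog(@.*|-[0-9a-f])?$",
		    REG_EXTENDED | REG_NOSUB) != 0)
		return false;

	ok = regexec(&re, name, 0, NULL, 0) == 0;
	regfree(&re);
	return ok;
}
```

Under this pattern "watchdog", "watchdog@1c20c90" and "watchdog-1" are accepted, while "watchdog-10" is rejected because the suffix alternative allows a single hex digit only.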





Re: [PATCH 3/3] firmware: add mutex fw_lock_fallback for race condition

2019-08-20 Thread Scott Branden

Hi Luis,

I'm glad you are a subject expert in this area.

Some more comments inline.


On 2019-08-19 6:26 p.m., Luis Chamberlain wrote:
> On Mon, Aug 19, 2019 at 09:19:51AM -0700, Scott Branden wrote:
> > To be honest, I find the entire firmware code sloppy.
>
> And that is after years of cleanup on my part. Try going back to v4.1
> for instance, check the code out then for an incredibly horrific sight :)
>
> > I don't think the cache/no-cache feature is
> > implemented or tested properly nor fallback to begin with.
>
> I'm in total agreement! I *know* there must be holes in that code, and I
> acknowledge a few possible gotchas on the commit logs. For instance, I
> acknowledged that the firmware cache had a secondary purpose which was
> not well documented or understood through commit e44565f62a720
> ("firmware: fix batched requests - wake all waiters"). The firmware
> cache allows for batching requests and sharing the same original request
> for multiple consecutive requests which *race against each other*.
> That's when I started having my doubts about the architecture of the
> firmware cache mechanism; it seemed too complex and perhaps overkill,
> and I considered killing it.

Great (kill it!).  I have no need for cached or batched requests.

That would remove a lot of problems.



> As I noted in that commit, the firmware cache is used for:
>
> 1) Addressing races with file lookups during the suspend/resume cycle by
>    keeping firmware in memory during the suspend/resume cycle
>
> 2) Batched requests for the same file rely only on work from the first
>    file lookup, which keeps the firmware in memory until the last
>    release_firmware() is called
>
> Also worth quoting from that commit as well:
>
> "Batched requests *only* take effect if secondary requests come in
> prior to the first user calling release_firmware(). The devres name used
> for the internal firmware cache is used as a hint other pending requests
> are ongoing, the firmware buffer data is kept in memory until the last
> user of the buffer calls release_firmware(), therefore serializing
> requests and delaying the release until all requests are done."
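
The batching behaviour quoted above (later requests reuse the first lookup, and the buffer lives until the last release) can be modelled with a reference count. The sketch below is a single-threaded userspace toy, not the firmware loader: all mock_* names are hypothetical, and the locking and waiting that the real code needs are deliberately left out.

```c
#include <string.h>

#define MOCK_CACHE_SLOTS 4

/* Hypothetical cache entry; "loads" counts real file lookups. */
struct mock_fw {
	char name[32];
	int refs;	/* outstanding users of this buffer */
	int loads;	/* times the file was actually looked up */
	int in_use;	/* non-zero while the buffer is kept in memory */
};

static struct mock_fw mock_cache[MOCK_CACHE_SLOTS];

/* A request for a name that is still cached shares the first lookup. */
static struct mock_fw *mock_request_firmware(const char *name)
{
	struct mock_fw *free_slot = NULL;

	for (int i = 0; i < MOCK_CACHE_SLOTS; i++) {
		struct mock_fw *fw = &mock_cache[i];

		if (fw->in_use && strcmp(fw->name, name) == 0) {
			fw->refs++;	/* batched request: no new lookup */
			return fw;
		}
		if (!fw->in_use && !free_slot)
			free_slot = fw;
	}
	if (!free_slot)
		return NULL;

	/* First user: perform the (simulated) file lookup. */
	strncpy(free_slot->name, name, sizeof(free_slot->name) - 1);
	free_slot->name[sizeof(free_slot->name) - 1] = '\0';
	free_slot->refs = 1;
	free_slot->loads++;
	free_slot->in_use = 1;
	return free_slot;
}

/* The buffer is dropped only when the last user releases it. */
static void mock_release_firmware(struct mock_fw *fw)
{
	if (fw && --fw->refs == 0)
		fw->in_use = 0;
}
```

In this model a second request that arrives before the first release shares the buffer, whereas a request that arrives after the last release triggers a fresh lookup, which mirrors the "only take effect if secondary requests come in prior to the first user calling release_firmware()" condition.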

> Later we discovered that the firmware cache had a serious security issue
> since its inception through commit 422b3db2a503 ("firmware: Fix security
> issue with request_firmware_into_buf()"). Granted, exploiting this would
> require the ability to load kernel code, so the vector of exploitation
> is rather small.
>
> The cache stuff cannot be removed as it *at least* resolves the fw
> suspend stuff, but still, this can likely use a revisit in architecture
> long term. The second implicit use case for batched requests however
> seems complex and not sure if it's worth maintaining. I'll note that
> at least some drivers *do* their own firmware caching, iwlwifi, is one,
> so there is an example there to allow drivers to say "I actually don't
> need caching" for the future.
>
> If you're volunteering to clean up / test the cache stuff I highly
> welcome that.

I would only volunteer to remove it, not test or support it.

> That and the fallback stuff has been needing testing for
> years. Someone was working on patches on the test case for cache stuff
> a while ago, from Intel, but they disappeared.

Again, I would only volunteer to remove the fallback mechanism to remove
added race conditions.

> > I'm not claiming this patch is the final
> > solution and indicated such in the cover letter and the comment above.
>
> I missed that sorry.
>
> > I hope there is someone more familiar with this code to comment further and
> > come up with a proper solution.
>
> Alright, I'll dig in and take a look, and propose an alternative.
>
> > I have found numerous issues and race conditions with the firmware code (I
> > simply added a test).
>
> That is nothing compared to the amount of fixes I have found and
> actually fixed too, the code was a nightmare before I took on
> maintenance.
>
> > 1) Try loading the same valid firmware using no-cache once it has already
> > been loaded with cache.
>
> :)
>
> > It won't work, which is why I had to use a different filename in the test
> > for request_firmware_into_buf.
>
> Alright, I'll go try to fix this. Thanks for the report.

I think it's a minor issue compared to the race conditions present.

In reality I don't think anyone will load the same firmware using cache vs.
no-cache.

It's just something I stumbled upon when adding the test case and then
had to avoid.





> > 2) Try removing the "if (opt_flags & FW_OPT_NOCACHE)" in my patch and always
> > call the mutex.
> >
> > The firmware test will lock up during a "no uevent" test.  I am not familiar
> > with the code to know why such is true and what issue this exposes in the code.
>
> I hinted in my review of the oops what the issue was.

I don't know if it's the same bug for the "no uevent" test case though?
The test just hangs and the kernel oops is not present.  It might be exposing
another underlying issue with the request_firmware code.

> > 3) I have a driver that uses request_firmware_into_buf and have multiple
> > instances of the driver
>
> Cool, is the driver 

Re: [PATCH v2 1/2] dt-bindings: phy: intel-emmc-phy: Add YAML schema for LGM eMMC PHY

2019-08-20 Thread Rob Herring
On Tue, Aug 20, 2019 at 5:31 AM Ramuthevar,Vadivel MuruganX wrote:
>
> From: Ramuthevar Vadivel Murugan 
>
> Add a YAML schema to use the host controller driver with the
> eMMC PHY on Intel's Lightning Mountain SoC.
>
> Signed-off-by: Ramuthevar Vadivel Murugan 
> 
> ---
> changes in v2:
>   As per Rob Herring review comments, the following updates
>  - change GPL-2.0 -> (GPL-2.0-only OR BSD-2-Clause)
>  - filename is the compatible string plus .yaml
>  - LGM: Lightning Mountain
>  - update maintainer
>  - add intel,syscon under property list
>  - keep one example instead of two
> ---
>  .../bindings/phy/intel,lgm-emmc-phy.yaml   | 72 ++
>  1 file changed, 72 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/phy/intel,lgm-emmc-phy.yaml
>
> diff --git a/Documentation/devicetree/bindings/phy/intel,lgm-emmc-phy.yaml b/Documentation/devicetree/bindings/phy/intel,lgm-emmc-phy.yaml
> new file mode 100644
> index ..ec177573aca6
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/phy/intel,lgm-emmc-phy.yaml
> @@ -0,0 +1,72 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/phy/intel,lgm-emmc-phy.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Intel Lightning Mountain(LGM) eMMC PHY Device Tree Bindings
> +
> +maintainers:
> +  - Ramuthevar Vadivel Murugan 
> +
> +
> +description:
> +  -  Add a new compatible to use the host controller driver with the
> + eMMC PHY on Intel's Lightning Mountain SoC.
> +
> +$ref: /schemas/types.yaml#definitions/phandle
> +  description:
> +- It also requires a "syscon" node with compatible = "intel,lgm-chiptop",
> +  "syscon" to access the eMMC PHY register.

Not valid schema. Please build 'make dt_binding_check' and fix any warnings.

> +
> +properties:
> +  "#phy-cells":
> +const: 0
> +
> +  compatible:
> +const: intel,lgm-emmc-phy
> +
> +  reg:
> +maxItems: 1
> +
> +  intel,syscon:
> +items:
> +  - description:
> + - |
> +   e-MMC phy module should include the following properties
> +   * reg, Access the e-MMC, get the base address from syscon.
> +   * reset, reset the e-MMC module.
> +
> +  clocks:
> +items:
> +  - description: e-MMC phy module clock
> +
> +  clock-names:
> +items:
> +  - const: emmcclk
> +
> +  resets:
> +maxItems: 1
> +
> +required:
> +  - "#phy-cells"
> +  - compatible
> +  - reg
> +  - clocks
> +  - clock-names
> +  - resets
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +emmc_phy: emmc_phy {
> +compatible = "intel,lgm-emmc-phy";
> +reg = <0xe002 0x100>;
> +intel,syscon = <>;
> +clocks = <>;
> +clock-names = "emmcclk";
> +#phy-cells = <0>;
> +};
> +
> +...
> --
> 2.11.0
>


Re: [PATCH] MAINTAINERS, x86/CPU: Tony Luck will maintain asm/intel-family.h

2019-08-20 Thread Peter Zijlstra
On Tue, Aug 20, 2019 at 05:40:11PM +0200, Peter Zijlstra wrote:
> > _ULT
> > _MOBILE
> 
> I suspect these two are the same.

for i in `git grep -l "INTEL_FAM6_.*_MOBILE"`
do
sed -i -e 's/\(INTEL_FAM6_.*\)_MOBILE/\1_ULT/g' ${i}
done

---
 arch/x86/events/intel/core.c  | 16 +++
 arch/x86/events/intel/cstate.c| 14 ++---
 arch/x86/events/intel/rapl.c  |  8 
 arch/x86/events/intel/uncore.c|  6 +++---
 arch/x86/events/msr.c |  6 +++---
 arch/x86/include/asm/intel-family.h   |  8 
 arch/x86/kernel/apic/apic.c   |  4 ++--
 arch/x86/kernel/cpu/bugs.c|  4 ++--
 arch/x86/kernel/cpu/intel.c   |  4 ++--
 drivers/cpufreq/intel_pstate.c|  2 +-
 tools/power/x86/turbostat/turbostat.c | 38 +--
 11 files changed, 55 insertions(+), 55 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 76bff3a33725..a0945aa897d6 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3978,13 +3978,13 @@ static const struct x86_cpu_desc isolation_ucodes[] = {
INTEL_CPU_DESC(INTEL_FAM6_BROADWELL_X,   2, 0x0b14),
INTEL_CPU_DESC(INTEL_FAM6_SKYLAKE_X, 3, 0x0021),
INTEL_CPU_DESC(INTEL_FAM6_SKYLAKE_X, 4, 0x),
-   INTEL_CPU_DESC(INTEL_FAM6_SKYLAKE_MOBILE,3, 0x007c),
+   INTEL_CPU_DESC(INTEL_FAM6_SKYLAKE_ULT,   3, 0x007c),
INTEL_CPU_DESC(INTEL_FAM6_SKYLAKE,   3, 0x007c),
INTEL_CPU_DESC(INTEL_FAM6_KABYLAKE,  9, 0x004e),
-   INTEL_CPU_DESC(INTEL_FAM6_KABYLAKE_MOBILE,   9, 0x004e),
-   INTEL_CPU_DESC(INTEL_FAM6_KABYLAKE_MOBILE,  10, 0x004e),
-   INTEL_CPU_DESC(INTEL_FAM6_KABYLAKE_MOBILE,  11, 0x004e),
-   INTEL_CPU_DESC(INTEL_FAM6_KABYLAKE_MOBILE,  12, 0x004e),
+   INTEL_CPU_DESC(INTEL_FAM6_KABYLAKE_ULT,  9, 0x004e),
+   INTEL_CPU_DESC(INTEL_FAM6_KABYLAKE_ULT, 10, 0x004e),
+   INTEL_CPU_DESC(INTEL_FAM6_KABYLAKE_ULT, 11, 0x004e),
+   INTEL_CPU_DESC(INTEL_FAM6_KABYLAKE_ULT, 12, 0x004e),
INTEL_CPU_DESC(INTEL_FAM6_KABYLAKE, 10, 0x004e),
INTEL_CPU_DESC(INTEL_FAM6_KABYLAKE, 11, 0x004e),
INTEL_CPU_DESC(INTEL_FAM6_KABYLAKE, 12, 0x004e),
@@ -4955,9 +4955,9 @@ __init int intel_pmu_init(void)
case INTEL_FAM6_SKYLAKE_X:
pmem = true;
/* fall through */
-   case INTEL_FAM6_SKYLAKE_MOBILE:
+   case INTEL_FAM6_SKYLAKE_ULT:
case INTEL_FAM6_SKYLAKE:
-   case INTEL_FAM6_KABYLAKE_MOBILE:
+   case INTEL_FAM6_KABYLAKE_ULT:
case INTEL_FAM6_KABYLAKE:
x86_add_quirk(intel_pebs_isolation_quirk);
x86_pmu.late_ack = true;
@@ -5005,7 +5005,7 @@ __init int intel_pmu_init(void)
case INTEL_FAM6_ICELAKE_XEON_D:
pmem = true;
/* fall through */
-   case INTEL_FAM6_ICELAKE_MOBILE:
+   case INTEL_FAM6_ICELAKE_ULT:
case INTEL_FAM6_ICELAKE:
x86_pmu.late_ack = true;
memcpy(hw_cache_event_ids, skl_hw_cache_event_ids, sizeof(hw_cache_event_ids));
diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 3854400ad8ff..39c48a0dc6dc 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -608,14 +608,14 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
X86_CSTATES_MODEL(INTEL_FAM6_BROADWELL_GT3E,   snb_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_BROADWELL_X,  snb_cstates),
 
-   X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_MOBILE, snb_cstates),
-   X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE,snb_cstates),
-   X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_X,  snb_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_ULT, snb_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE, snb_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_SKYLAKE_X,   snb_cstates),
 
-   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_MOBILE, hswult_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE_ULT, hswult_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_KABYLAKE,hswult_cstates),
 
-   X86_CSTATES_MODEL(INTEL_FAM6_CANNONLAKE_MOBILE, cnl_cstates),
+   X86_CSTATES_MODEL(INTEL_FAM6_CANNONLAKE_ULT, cnl_cstates),
 
X86_CSTATES_MODEL(INTEL_FAM6_XEON_PHI_KNL, knl_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_XEON_PHI_KNM, knl_cstates),
@@ -625,8 +625,8 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
 
X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT_PLUS, glm_cstates),
 
-   X86_CSTATES_MODEL(INTEL_FAM6_ICELAKE_MOBILE, snb_cstates),
-   X86_CSTATES_MODEL(INTEL_FAM6_ICELAKE,snb_cstates),
+   

Re: [PATCH] sched/core: Schedule new worker even if PI-blocked

2019-08-20 Thread Sebastian Andrzej Siewior
On 2019-08-20 17:20:25 [+0200], Peter Zijlstra wrote:
> > There is RCU (boosting) and futex. I'm sceptical about the i2c users…
> 
> Well, yes, I too was/am sceptical, but it was tglx who twisted my arm
> and said the i2c people were right and rt_mutex is/should-be a generic
> usable interface.

I don't mind the generic interface, I just find the use-case odd. So by
now rtmutex is used by i2c core and not a single driver like it was the
case the last time I looked at it. But still, why is it (PI-boosting)
important for I2C to use it and not for other subsystems? Moving on…

> > > > --- a/kernel/sched/core.c
> > > > +++ b/kernel/sched/core.c
> > > > @@ -3945,7 +3945,7 @@ void __noreturn do_task_dead(void)
> > > >  
> > > >  static inline void sched_submit_work(struct task_struct *tsk)
> > > >  {
> > > > -   if (!tsk->state || tsk_is_pi_blocked(tsk))
> > > > +   if (!tsk->state)
> > > > return;
> > > >  
> > > > /*
> 
> So this part actually makes rt_mutex less special and is good.
> 
> > > > @@ -3961,6 +3961,9 @@ static inline void sched_submit_work(str
> > > > preempt_enable_no_resched();
> > > > }
> > > >  
> > > > +   if (tsk_is_pi_blocked(tsk))
> > > > +   return;
> > > > +
> > > > /*
> > > >  * If we are going to sleep and we have plugged IO queued,
> > > >  * make sure to submit it to avoid deadlocks.
> > > 
> > > What do we need that clause for? Why is pi_blocked special _at_all_?
> > 
> > so on !RT the scheduler does nothing special if a task blocks on a
> > sleeping lock.
> > If I remember correctly then blk_schedule_flush_plug() is the problem.
> > It may require a lock which is held by the task.
> > It may hold A and wait for B while another task has B and waits for A.
> > If my memory does not betray me then ext+jbd can lockup without this.
> 
> And am I right in thinking that that, again, is specific to the
> sleeping-spinlocks from PREEMPT_RT? Is there really nothing else that
> identifies those more specifically? It's been a while since I looked at
> them.

Not really. I hacked "int sleeping_lock" into task_struct which is
incremented each time a "sleeping lock" version of rtmutex is requested.
We have two users as of now:
- RCU, which checks if we schedule() while holding rcu_read_lock() which
  is okay if it is a sleeping lock.

- NOHZ's pending softirq detection while going to idle. It is possible
  that "ksoftirqd" and "current" are blocked on locks and the CPU goes
  to idle (because nothing else is runnable) with pending softirqs.

I wanted to let rtmutex invoke another schedule() function in case of a
sleeping lock to avoid the RCU warning. This would avoid incrementing
"sleeping_lock" in the fast path. But then I had no idea what to do with
the NOHZ thing.

> Also, I suppose it would be really good to put that in a comment.

So, what does that mean for that patch? According to my inbox it has
been applied to an "urgent" branch. Do I resubmit the whole thing or just
a comment on top?

Sebastian
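
The effect of the two quoted hunks on sched_submit_work() can be modelled in userspace as below. This is a sketch under assumptions: the structures and function names are hypothetical, and the workqueue notification step between the two hunks is inferred from the patch subject and the hunk context rather than shown in full in the quoted diff.

```c
#include <stdbool.h>

/* Hypothetical flattened view of what the function inspects. */
struct mock_task {
	bool going_to_sleep;	/* models tsk->state != 0 */
	bool is_wq_worker;	/* models the workqueue-worker check */
	bool pi_blocked;	/* models tsk_is_pi_blocked(tsk) */
	bool has_plugged_io;	/* models queued plugged block I/O */
};

/* What the function ended up doing for a given task. */
struct mock_actions {
	bool notified_workqueue;
	bool flushed_plug;
};

/* Ordering before the patch: a PI-blocked task returns immediately. */
static struct mock_actions submit_work_old(const struct mock_task *t)
{
	struct mock_actions a = { false, false };

	if (!t->going_to_sleep || t->pi_blocked)
		return a;
	if (t->is_wq_worker)
		a.notified_workqueue = true;
	if (t->has_plugged_io)
		a.flushed_plug = true;
	return a;
}

/* Ordering after the patch: only the plug flush is skipped when PI-blocked. */
static struct mock_actions submit_work_new(const struct mock_task *t)
{
	struct mock_actions a = { false, false };

	if (!t->going_to_sleep)
		return a;
	if (t->is_wq_worker)
		a.notified_workqueue = true;
	if (t->pi_blocked)
		return a;
	if (t->has_plugged_io)
		a.flushed_plug = true;
	return a;
}
```

For a PI-blocked worker with plugged I/O, the old ordering does nothing at all, while the new ordering still notifies the workqueue and skips only the plug flush, which is the step identified above as the potential deadlock.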


Re: [PATCH v2 1/4] leds: lm3532: Fix brightness control for i2c mode

2019-08-20 Thread Dan Murphy

Pavel

On 8/19/19 5:48 AM, Pavel Machek wrote:
> On Tue 2019-08-13 13:11:51, Dan Murphy wrote:
> > Fix the brightness control for I2C mode.  Instead of
> > changing the full scale current register update the ALS target
> > register for the appropriate banks.
> >
> > In addition clean up some code errors and random misspellings found
> > during coding.
> >
> > Tested on Droid4 as well as LM3532 EVM connected to a BeagleBoneBlack
> >
> > Fixes: e37a7f8d77e1 ("leds: lm3532: Introduce the lm3532 LED driver")
> > Reported-by: Pavel Machek
> > Signed-off-by: Dan Murphy
>
> I may prefer register renames to come separately, but ...

I can separate them into a different patch

Dan

> Acked-by: Pavel Machek
> Pavel


Re: [PATCH v2 1/2] dt-bindings: media: Add YAML schemas for the generic RC bindings

2019-08-20 Thread Rob Herring
On Tue, Aug 20, 2019 at 4:50 AM Maxime Ripard  wrote:
>
> Hi Sean,
>
> On Tue, Aug 20, 2019 at 09:15:26AM +0100, Sean Young wrote:
> > On Mon, Aug 19, 2019 at 08:26:18PM +0200, Maxime Ripard wrote:
> > > From: Maxime Ripard 
> > >
> > > The RC controllers have a bunch of generic properties that are needed in a
> > > device tree. Add YAML schemas for those.
> > >
> > > Reviewed-by: Rob Herring 
> > > Signed-off-by: Maxime Ripard 
> >
> > For the series (both 1/2 and 2/2):
> >
> > Reviewed-by: Sean Young 
> >
> > Whose tree should this go through?
>
> Either yours or Rob's, I guess?

Sean's because there are other changes to
Documentation/devicetree/bindings/media/sunxi-ir.txt in -next.

Rob


Re: [PATCH v2 4/4] leds: lm3532: Add full scale current configuration

2019-08-20 Thread Dan Murphy

Pavel

Thanks for the review

On 8/19/19 5:55 AM, Pavel Machek wrote:
> Hi!
>
> > Allow the full scale current to be configured at init.
> > Valid ranges are 5mA->29.8mA.
> >
> > Signed-off-by: Dan Murphy
> > @@ -121,6 +125,7 @@ struct lm3532_als_data {
> >    * @mode - Mode of the LED string
> >    * @ctrl_brt_pointer - Zone target register that controls the sink
> >    * @num_leds - Number of LED strings are supported in this array
> > + * @full_scale_current - The full-scale current setting for the current sink.
> >    * @led_strings - The LED strings supported in this array
> >    * @label - LED label
> >    */
> > @@ -130,8 +135,9 @@ struct lm3532_led {
> >
> >     int control_bank;
> >     int mode;
> > -   int ctrl_brt_pointer;
> >     int num_leds;
> > +   int ctrl_brt_pointer;
> > +   int full_scale_current;
> >     u32 led_strings[LM3532_MAX_CONTROL_BANKS];
> >     char label[LED_MAX_NAME_SIZE];
> >   };
>
> No need to move ctrl_brt_pointer... to keep order consistent with docs.

OK I will reset the patches and get rid of that change.  I think this
got moved when I applied the v1 patch.

> > +   fs_current_val = led->full_scale_current - LM3532_FS_CURR_MIN /
> > +                    LM3532_FS_CURR_STEP;
>
> The computation is wrong ... needs () AFAICT.

Hmm. Doesn't order of operations take precedence?

I will add the () unless checkpatch cribs about them

Dan

> Best regards,
> Pavel
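
The operator-precedence point raised in the review can be checked with plain integer arithmetic. In C, division binds tighter than subtraction, so the expression as posted divides the minimum by the step first. The constants below are assumptions (microamps, chosen to match the 5mA to 29.8mA range stated in the commit message) and the function names are hypothetical.

```c
/* Assumed values in microamps; not taken from the driver source. */
#define SKETCH_FS_CURR_MIN	5000
#define SKETCH_FS_CURR_STEP	800

/* As posted: parsed as current - (MIN / STEP). */
static int fs_val_as_posted(int full_scale_current)
{
	return full_scale_current - SKETCH_FS_CURR_MIN / SKETCH_FS_CURR_STEP;
}

/* With the parentheses the review asks for: (current - MIN) / STEP. */
static int fs_val_parenthesised(int full_scale_current)
{
	return (full_scale_current - SKETCH_FS_CURR_MIN) / SKETCH_FS_CURR_STEP;
}
```

With these assumed constants, a request for 29800 uA gives 29794 from the posted expression but 31 from the parenthesised one, and only the latter is a small step index that could fit a register field.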

