Re: [PATCH v4 0/3] Fix pinctrl-single pcs_pin_dbg_show()

2021-03-21 Thread Drew Fustini
On Fri, Mar 19, 2021 at 05:21:30PM +0200, Hanna Hawa wrote:
> These patches fix the pcs_pin_dbg_show() function for the scenario where
> a single register controls multiple pins (i.e. bits_per_mux is not zero)
> Additionally, the common formula is moved to a separate function to
> allow reuse.
> 
> Changes since v3:
> - define and set variable 'mux_bytes' in one line
> - update commit message
> 
> Changes since v2:
> - move the register read() outside of the if condition (as it is a
>   common read()).
> - remove extra parentheses
> - replace offset variable by direct return statements
> 
> Changes since v1:
> - remove unused variable in function 'pcs_allocate_pin_table'
>   (Reported-by: kernel test robot )
> 
> Hanna Hawa (3):
>   pinctrl: pinctrl-single: remove unused variable
>   pinctrl: pinctrl-single: remove unused parameter
>   pinctrl: pinctrl-single: fix pcs_pin_dbg_show() when bits_per_mux is
> not zero
> 
>  drivers/pinctrl/pinctrl-single.c | 65 ++--
>  1 file changed, 37 insertions(+), 28 deletions(-)
> 
> -- 
> 2.17.1
> 

I'm curious: which SoC are you using?

It's good to know who has hardware to test bits_per_mux in the future.

I pay attention to pinctrl-single as that is the driver used for the TI
AM3358 SoC used in a variety of BeagleBone boards.  It does not use 
bits_per_mux, but I can verify that this does not cause any regression
for the AM3358 SoC:

  /sys/kernel/debug/pinctrl/44e10800.pinmux-pinctrl-single# cat pins
  registered pins: 142
  pin 0 (PIN0) 0:? 44e10800 0027 pinctrl-single
  pin 1 (PIN1) 0:? 44e10804 0027 pinctrl-single
  pin 2 (PIN2) 0:? 44e10808 0027 pinctrl-single
  pin 3 (PIN3) 0:? 44e1080c 0027 pinctrl-single
  pin 4 (PIN4) 0:? 44e10810 0027 pinctrl-single
  pin 5 (PIN5) 0:? 44e10814 0027 pinctrl-single
  pin 6 (PIN6) 0:? 44e10818 0027 pinctrl-single
  pin 7 (PIN7) 0:? 44e1081c 0027 pinctrl-single
  pin 8 (PIN8) 22:gpio-96-127 44e10820 0027 pinctrl-single
  pin 9 (PIN9) 23:gpio-96-127 44e10824 0037 pinctrl-single
  pin 10 (PIN10) 26:gpio-96-127 44e10828 0037 pinctrl-single
  pin 11 (PIN11) 27:gpio-96-127 44e1082c 0037 pinctrl-single
  pin 12 (PIN12) 0:? 44e10830 0037 pinctrl-single
  
  pin 140 (PIN140) 0:? 44e10a30 0028 pinctrl-single
  pin 141 (PIN141) 13:gpio-64-95 44e10a34 0020 pinctrl-single

Reviewed-by: Drew Fustini 

Thanks,
Drew
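
For readers following along, here is a minimal userspace sketch of the kind
of offset formula the cover letter describes. It is not the code from the
patch: struct pcs_sketch, its field names and pin_reg_offset() are
hypothetical, and the register-sharing branch is an assumption about how
several pins packed into one register would be located. The one-register-
per-pin branch agrees with the AM3358 listing above (pin 3 at base + 0xc,
pin 141 at base + 0x234).

```c
#include <assert.h>

#define BITS_PER_BYTE 8u

/* Hypothetical stand-in for the driver state; field names are assumptions. */
struct pcs_sketch {
	unsigned int width;        /* register width in bits, e.g. 32 */
	unsigned int bits_per_pin; /* bits per pin; 0 when one register maps to one pin */
};

/* Byte offset (from the pinmux base) of the register holding @pin. */
unsigned int pin_reg_offset(const struct pcs_sketch *pcs, unsigned int pin)
{
	unsigned int mux_bytes;

	assert(pcs != 0);
	mux_bytes = pcs->width / BITS_PER_BYTE;
	assert(mux_bytes > 0);

	if (pcs->bits_per_pin) {
		/*
		 * Several pins share one register: find the byte the pin's
		 * field starts in, then round down to a register boundary.
		 */
		unsigned int pin_offset_bytes =
			(pcs->bits_per_pin * pin) / BITS_PER_BYTE;

		return (pin_offset_bytes / mux_bytes) * mux_bytes;
	}

	/* One register per pin. */
	return pin * mux_bytes;
}
```

With width = 32 and bits_per_pin = 4 (an assumed layout), pins 0 to 7 land
in the register at offset 0 and pins 8 to 15 in the register at offset 4.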


[PATCH 3/4] spi: mediatek: add mtk_spi_compatible support

2021-03-21 Thread Leilk Liu
This patch adds max_fifo_size and must_rx compat support.

Signed-off-by: Leilk Liu 
---
 drivers/spi/spi-slave-mt27xx.c | 28 
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/spi/spi-slave-mt27xx.c b/drivers/spi/spi-slave-mt27xx.c
index 44edaa360405..7e6fadc88cef 100644
--- a/drivers/spi/spi-slave-mt27xx.c
+++ b/drivers/spi/spi-slave-mt27xx.c
@@ -10,6 +10,8 @@
 #include 
 #include 
 #include 
+#include 
+
 
 #define SPIS_IRQ_EN_REG    0x0
 #define SPIS_IRQ_CLR_REG   0x4
@@ -61,8 +63,6 @@
 #define SPIS_DMA_ADDR_EN   BIT(1)
 #define SPIS_SOFT_RST  BIT(0)
 
-#define MTK_SPI_SLAVE_MAX_FIFO_SIZE 512U
-
 struct mtk_spi_slave {
struct device *dev;
void __iomem *base;
@@ -70,10 +70,19 @@ struct mtk_spi_slave {
struct completion xfer_done;
struct spi_transfer *cur_transfer;
bool slave_aborted;
+   const struct mtk_spi_compatible *dev_comp;
 };
 
+struct mtk_spi_compatible {
+   const u32 max_fifo_size;
+   bool must_rx;
+};
+static const struct mtk_spi_compatible mt2712_compat = {
+   .max_fifo_size = 512,
+};
 static const struct of_device_id mtk_spi_slave_of_match[] = {
-   { .compatible = "mediatek,mt2712-spi-slave", },
+   { .compatible = "mediatek,mt2712-spi-slave",
+ .data = (void *)&mt2712_compat,},
{}
 };
 MODULE_DEVICE_TABLE(of, mtk_spi_slave_of_match);
@@ -272,7 +281,7 @@ static int mtk_spi_slave_transfer_one(struct spi_controller *ctlr,
mdata->slave_aborted = false;
mdata->cur_transfer = xfer;
 
-   if (xfer->len > MTK_SPI_SLAVE_MAX_FIFO_SIZE)
+   if (xfer->len > mdata->dev_comp->max_fifo_size)
return mtk_spi_slave_dma_transfer(ctlr, spi, xfer);
else
return mtk_spi_slave_fifo_transfer(ctlr, spi, xfer);
@@ -369,6 +378,7 @@ static int mtk_spi_slave_probe(struct platform_device *pdev)
struct spi_controller *ctlr;
struct mtk_spi_slave *mdata;
int irq, ret;
+   const struct of_device_id *of_id;
 
ctlr = spi_alloc_slave(&pdev->dev, sizeof(*mdata));
if (!ctlr) {
@@ -386,7 +396,17 @@ static int mtk_spi_slave_probe(struct platform_device *pdev)
ctlr->setup = mtk_spi_slave_setup;
ctlr->slave_abort = mtk_slave_abort;
 
+   of_id = of_match_node(mtk_spi_slave_of_match, pdev->dev.of_node);
+   if (!of_id) {
+   dev_err(&pdev->dev, "failed to probe of_node\n");
+   ret = -EINVAL;
+   goto err_put_ctlr;
+   }
mdata = spi_controller_get_devdata(ctlr);
+   mdata->dev_comp = of_id->data;
+
+   if (mdata->dev_comp->must_rx)
+   ctlr->flags = SPI_MASTER_MUST_RX;
 
platform_set_drvdata(pdev, ctlr);
 
-- 
2.18.0



[PATCH 4/4] spi: mediatek: add mt8195 spi slave support

2021-03-21 Thread Leilk Liu
This patch adds mt8195 spi slave compatible support.

Signed-off-by: Leilk Liu 
---
 drivers/spi/spi-slave-mt27xx.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/spi/spi-slave-mt27xx.c b/drivers/spi/spi-slave-mt27xx.c
index 7e6fadc88cef..f199a6c4738a 100644
--- a/drivers/spi/spi-slave-mt27xx.c
+++ b/drivers/spi/spi-slave-mt27xx.c
@@ -77,12 +77,20 @@ struct mtk_spi_compatible {
const u32 max_fifo_size;
bool must_rx;
 };
+
 static const struct mtk_spi_compatible mt2712_compat = {
.max_fifo_size = 512,
 };
+static const struct mtk_spi_compatible mt8195_compat = {
+   .max_fifo_size = 128,
+   .must_rx = true,
+};
+
 static const struct of_device_id mtk_spi_slave_of_match[] = {
{ .compatible = "mediatek,mt2712-spi-slave",
  .data = (void *)&mt2712_compat,},
+   { .compatible = "mediatek,mt8195-spi-slave",
+ .data = (void *)&mt8195_compat,},
{}
 };
 MODULE_DEVICE_TABLE(of, mtk_spi_slave_of_match);
-- 
2.18.0
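
As an illustration of the per-SoC match-data pattern used by patches 3/4 and
4/4, here is a small userspace sketch. The table values (512 and 128 byte
FIFOs, must_rx only on mt8195) are taken from the diffs above; the struct
and function names (spi_compat_sketch, compat_lookup, use_dma) are
hypothetical and the string lookup merely stands in for of_match_node().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of struct mtk_spi_compatible plus its match string. */
struct spi_compat_sketch {
	const char *compatible;
	unsigned int max_fifo_size; /* bytes the FIFO path can handle */
	bool must_rx;               /* controller always needs an RX buffer */
};

static const struct spi_compat_sketch compat_table[] = {
	{ "mediatek,mt2712-spi-slave", 512, false },
	{ "mediatek,mt8195-spi-slave", 128, true },
};

/* Return the per-SoC data for @compatible, or NULL if it is unknown. */
const struct spi_compat_sketch *compat_lookup(const char *compatible)
{
	size_t i;

	assert(compatible != NULL);
	for (i = 0; i < sizeof(compat_table) / sizeof(compat_table[0]); i++)
		if (strcmp(compat_table[i].compatible, compatible) == 0)
			return &compat_table[i];
	return NULL;
}

/* Transfers longer than the FIFO go through DMA, as in transfer_one(). */
bool use_dma(const struct spi_compat_sketch *c, unsigned int len)
{
	assert(c != NULL);
	return len > c->max_fifo_size;
}
```

A 200-byte transfer would therefore use the FIFO on mt2712 but DMA on
mt8195, which is the behavioural difference the new compat data encodes.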



[PATCH 0/4] Add Mediatek MT8195 SPI driver support

2021-03-21 Thread Leilk Liu
This series is based on spi/for-next and provides 4 patches to add MT8195 SPI
support.

Leilk Liu (4):
  spi: update spi master bindings for MT8195 SoC
  spi: update spi slave bindings for MT8195 SoC
  spi: mediatek: add mtk_spi_compatible support
  spi: mediatek: add mt8195 spi slave support

 .../devicetree/bindings/spi/spi-mt65xx.txt|  1 +
 .../bindings/spi/spi-slave-mt27xx.txt |  1 +
 drivers/spi/spi-slave-mt27xx.c| 36 ---
 3 files changed, 34 insertions(+), 4 deletions(-)

-- 
2.25.1




[PATCH 2/4] spi: update spi slave bindings for MT8195 SoC

2021-03-21 Thread Leilk Liu
Add DT binding documentation for the MT8195 SoC.

Signed-off-by: Leilk Liu 
---
 Documentation/devicetree/bindings/spi/spi-slave-mt27xx.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/spi/spi-slave-mt27xx.txt b/Documentation/devicetree/bindings/spi/spi-slave-mt27xx.txt
index c37e5a179b21..9192724540fd 100644
--- a/Documentation/devicetree/bindings/spi/spi-slave-mt27xx.txt
+++ b/Documentation/devicetree/bindings/spi/spi-slave-mt27xx.txt
@@ -3,6 +3,7 @@ Binding for MTK SPI Slave controller
 Required properties:
 - compatible: should be one of the following.
 - mediatek,mt2712-spi-slave: for mt2712 platforms
+- mediatek,mt8195-spi-slave: for mt8195 platforms
 - reg: Address and length of the register set for the device.
 - interrupts: Should contain spi interrupt.
 - clocks: phandles to input clocks.
-- 
2.18.0



[PATCH 1/4] spi: update spi master bindings for MT8195 SoC

2021-03-21 Thread Leilk Liu
Add DT binding documentation for the MT8195 SoC.

Signed-off-by: leilk.liu 
---
 Documentation/devicetree/bindings/spi/spi-mt65xx.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/spi/spi-mt65xx.txt b/Documentation/devicetree/bindings/spi/spi-mt65xx.txt
index 7bae7eef26c7..4d0e4c15c4ea 100644
--- a/Documentation/devicetree/bindings/spi/spi-mt65xx.txt
+++ b/Documentation/devicetree/bindings/spi/spi-mt65xx.txt
@@ -12,6 +12,7 @@ Required properties:
 - mediatek,mt8173-spi: for mt8173 platforms
 - mediatek,mt8183-spi: for mt8183 platforms
 - "mediatek,mt8192-spi", "mediatek,mt6765-spi": for mt8192 platforms
+- "mediatek,mt8195-spi", "mediatek,mt6765-spi": for mt8195 platforms
 - "mediatek,mt8516-spi", "mediatek,mt2712-spi": for mt8516 platforms
 - "mediatek,mt6779-spi", "mediatek,mt6765-spi": for mt6779 platforms
 
-- 
2.18.0



RE: [PATCH v3] exfat: speed up iterate/lookup by fixing start point of traversing cluster chain

2021-03-21 Thread Sungjong Seo
> When directory iterate and lookup are called, the start point for
> traversing the cluster chain is wrongly rewound to the first cluster of the
> parent directory entry. This causes repeated traversal of the cluster chain
> from the first entry of the parent directory, which degrades performance
> when the parent directory contains a huge number of files.
> Fix this by not rewinding; continue from the currently referenced cluster
> and dir entry.
> 
> Tested with 50,000 files under single directory / 256GB sdcard, with
> command "time ls -l > /dev/null",
> Before : 0m08.69s real 0m00.27s user 0m05.91s system
> After  : 0m07.01s real 0m00.25s user 0m04.34s system
> 
> Signed-off-by: Hyeongseok Kim 

Looks good.
Thanks for your contribution.

Reviewed-by: Sungjong Seo 

> ---
>  fs/exfat/dir.c  | 19 +--
>  fs/exfat/exfat_fs.h |  2 +-
>  fs/exfat/namei.c|  9 -
>  3 files changed, 22 insertions(+), 8 deletions(-)
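
To make the idea concrete, here is a hedged userspace sketch of walking a
cluster chain with and without a resume hint. It does not reflect the
actual exfat code: the FAT is modelled as a plain array, and chain_hint,
chain_seek() and CHAIN_END are hypothetical names. It only shows why
continuing from the last visited cluster saves lookups compared with
rewinding to the first cluster on every call.

```c
#include <assert.h>
#include <stddef.h>

#define CHAIN_END 0xFFFFFFFFu

/* Last position reached in the chain (hypothetical helper structure). */
struct chain_hint {
	unsigned int index;   /* position in the chain */
	unsigned int cluster; /* cluster number found at that position */
};

/*
 * Return the cluster at position @target of the chain starting at @first,
 * or CHAIN_END if the chain is shorter. *@steps counts FAT lookups.
 * With a non-NULL @hint the walk resumes from the hinted position.
 */
unsigned int chain_seek(const unsigned int *fat, unsigned int first,
			unsigned int target, struct chain_hint *hint,
			unsigned int *steps)
{
	unsigned int index = 0, cluster = first;

	assert(fat != NULL && steps != NULL);

	/* Resume only if the hint is valid and not past the target. */
	if (hint && hint->cluster != CHAIN_END && hint->index <= target) {
		index = hint->index;
		cluster = hint->cluster;
	}

	while (index < target && cluster != CHAIN_END) {
		cluster = fat[cluster];
		index++;
		(*steps)++;
	}

	/* Remember where we stopped so the next call can continue. */
	if (hint && cluster != CHAIN_END) {
		hint->index = index;
		hint->cluster = cluster;
	}
	return cluster;
}
```

For a chain 2 -> 5 -> 3 -> 7, seeking position 2 and then position 3 costs
5 lookups when rewinding each time, but only 3 when the hint is carried
over between calls.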



Re: [PATCH v11 6/6] powerpc: Book3S 64-bit outline-only KASAN support

2021-03-21 Thread Daniel Axtens
Balbir Singh  writes:

> On Mon, Mar 22, 2021 at 11:55:08AM +1100, Daniel Axtens wrote:
>> Hi Balbir,
>> 
>> > Could you highlight the changes from
>> > https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20170729140901.5887-1-bsinghar...@gmail.com/?
>> >
>> > Feel free to use my signed-off-by if you need to and add/update copyright
>> > headers if appropriate.
>> 
>> There's not really anything in common any more:
>> 
>>  - ppc32 KASAN landed, so there was already a kasan.h for powerpc, the
>>explicit memcpy changes, the support for non-instrumented files,
>>prom_check.sh, etc. all already landed.
>> 
>>  - I locate the shadow region differently and don't resize any virtual
>>memory areas.
>> 
>>  - The ARCH_DEFINES_KASAN_ZERO_PTE handling changed upstream and our
>>handling for that is now handled more by patch 3.
>> 
>>  - The outline hook is now an inline function rather than a #define.
>> 
>>  - The init function has been totally rewritten as it's gone from
>>supporting real mode to not supporting real mode and back.
>> 
>>  - The list of non-instrumented files has grown a lot.
>> 
>>  - There's new stuff: stack walking is now safe, KASAN vmalloc support
>>means modules are better supported now, ptdump works, and there's
>>documentation.
>> 
>> It's been a while now, but I don't think when I started this process 2
>> years ago that I directly reused much of your code. So I'm not sure that
>> a signed-off-by makes sense here? Would a different tag (Originally-by?)
>> make more sense?
>>
>
> Sure

Will do.

>  
>> >> + * The shadow ends before the highest accessible address
>> >> + * because we don't need a shadow for the shadow. Instead:
>> >> + * c00e << 3 + a80e    000 = c00fc000
>> >
>> > The comment has one extra 0 in a80e.., I did the math and had to use
>> > the data from the defines :)
>> 
>> 3 extra 0s, even! Fixed.
>> 
>> >> +void __init kasan_init(void)
>> >> +{
>> >> + /*
>> >> +  * We want to do the following things:
>> >> +  *  1) Map real memory into the shadow for all physical memblocks
>> >> +  * This takes us from c000... to c008...
>> >> +  *  2) Leave a hole over the shadow of vmalloc space. KASAN_VMALLOC
>> >> +  * will manage this for us.
>> >> +  * This takes us from c008... to c00a...
>> >> +  *  3) Map the 'early shadow'/zero page over iomap and vmemmap space.
>> >> +  * This takes us up to where we start at c00e...
>> >> +  */
>> >> +
>> >
>> > assuming we have
>> > #define VMEMMAP_END R_VMEMMAP_END
>> > and ditto for hash we probably need
>> >
>> >BUILD_BUG_ON(VMEMMAP_END + KASAN_SHADOW_OFFSET != KASAN_SHADOW_END);
>> 
>> Sorry, I'm not sure what this is supposed to be testing? In what
>> situation would this trigger?
>>
>
> I am a bit concerned that we have hard-coded (IIRC) 0xa80e... in the
> config; any changes to VMEMMAP_END, KASAN_SHADOW_OFFSET/END
> should be guarded.
>

Ah that makes sense. I'll come up with some test that should catch any
unsynchronised changes to VMEMMAP_END, KASAN_SHADOW_OFFSET or
KASAN_SHADOW_END.

Kind regards,
Daniel Axtens

> Balbir Singh.


[PATCH v2] usb: cdnsp: Fixes issue with dequeuing requests after disabling endpoint

2021-03-21 Thread Pawel Laszczak
From: Pawel Laszczak 

Patch fixes the bug:
BUG: kernel NULL pointer dereference, address: 0050
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 0 PID: 4137 Comm: uvc-gadget Tainted: G   OE 
5.10.0-next-20201214+ #3
Hardware name: ASUS All Series/Q87T, BIOS 0908 07/22/2014
RIP: 0010:cdnsp_remove_request+0xe9/0x530 [cdnsp_udc_pci]
Code: 01 00 00 31 f6 48 89 df e8 64 d4 ff ff 48 8b 43 08 48 8b 13 45 31 f6 48 
89 42 08 48 89 10 b8 98 ff ff ff 48 89 1b 48 89 5b 08 <41> 83 6d 50 01 41 83 af 
d0 00 00 00 01 41 f6 84 24 78 20 00 00 08
RSP: 0018:b68d00d07b60 EFLAGS: 00010046
RAX: ff98 RBX: 9d29c57fbf00 RCX: 1400
RDX: 9d29c57fbf00 RSI:  RDI: 9d29c57fbf00
RBP: b68d00d07bb0 R08: 9d2ad9510a00 R09: 9d2ac011c000
R10: 9d2a12b6e760 R11:  R12: 9d29d3fb8000
R13:  R14:  R15: 9d29d3fb88c0
FS:  () GS:9d2adba0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0050 CR3: 000102164005 CR4: 001706f0
Call Trace:
 cdnsp_ep_dequeue+0x3c/0x90 [cdnsp_udc_pci]
 cdnsp_gadget_ep_dequeue+0x3f/0x80 [cdnsp_udc_pci]
 usb_ep_dequeue+0x21/0x70 [udc_core]
 uvcg_video_enable+0x19d/0x220 [usb_f_uvc]
 uvc_v4l2_release+0x49/0x90 [usb_f_uvc]
 v4l2_release+0xa5/0x100 [videodev]
 __fput+0x99/0x250
 fput+0xe/0x10
 task_work_run+0x75/0xb0
 do_exit+0x370/0xb80
 do_group_exit+0x43/0xa0
 get_signal+0x12d/0x820
 arch_do_signal_or_restart+0xb2/0x870
 ? __switch_to_asm+0x36/0x70
 ? kern_select+0xc6/0x100
 exit_to_user_mode_prepare+0xfc/0x170
 syscall_exit_to_user_mode+0x2a/0x40
 do_syscall_64+0x43/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fe969cf5dd7
Code: Unable to access opcode bytes at RIP 0x7fe969cf5dad.

The problem occurs with the UVC class. During disconnect, the UVC class
disables the endpoints and then starts dequeuing all requests. As a result,
requests are removed twice: first in cdnsp_gadget_ep_disable() and then in
cdnsp_gadget_ep_dequeue().
The patch adds a condition to cdnsp_gadget_ep_dequeue() so that requests
are dequeued only from an enabled endpoint.

Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Signed-off-by: Pawel Laszczak 

---
Changelog:
v2:
- removed unexpected 'commit' word from fixes tag

 drivers/usb/cdns3/cdnsp-gadget.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/usb/cdns3/cdnsp-gadget.c b/drivers/usb/cdns3/cdnsp-gadget.c
index f2ebbacd932e..d7d4bdd57f46 100644
--- a/drivers/usb/cdns3/cdnsp-gadget.c
+++ b/drivers/usb/cdns3/cdnsp-gadget.c
@@ -1128,6 +1128,10 @@ static int cdnsp_gadget_ep_dequeue(struct usb_ep *ep,
return -ESHUTDOWN;
}
 
+   /* Requests have already been dequeued while disabling the endpoint. */
+   if (!(pep->ep_state & EP_ENABLED))
+   return 0;
+
spin_lock_irqsave(&pdev->lock, flags);
ret = cdnsp_ep_dequeue(pep, to_cdnsp_request(request));
spin_unlock_irqrestore(&pdev->lock, flags);
-- 
2.25.1
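
The effect of the added check can be modelled with a tiny userspace sketch.
Everything here is hypothetical (sk_endpoint, SK_EP_ENABLED, the pending
counter) and much simpler than the driver; it only shows that once disable
has flushed the queue, a later dequeue returns early instead of touching
the already removed requests a second time.

```c
#include <assert.h>
#include <stddef.h>

#define SK_EP_ENABLED 0x1u

/* Minimal stand-in for an endpoint: state flags plus a request count. */
struct sk_endpoint {
	unsigned int state; /* SK_EP_* flags */
	int pending;        /* number of queued requests */
};

/* Disabling flushes every queued request and clears the enabled flag. */
void sk_ep_disable(struct sk_endpoint *ep)
{
	assert(ep != NULL);
	ep->pending = 0;
	ep->state &= ~SK_EP_ENABLED;
}

/*
 * Remove one request. Returns 0 on success (or when the endpoint is
 * already disabled, because disable has flushed everything), and -1 if
 * an enabled endpoint has nothing queued.
 */
int sk_ep_dequeue(struct sk_endpoint *ep)
{
	assert(ep != NULL);

	/* Guard corresponding to the check added by the patch. */
	if (!(ep->state & SK_EP_ENABLED))
		return 0;

	if (ep->pending == 0)
		return -1;

	ep->pending--;
	return 0;
}
```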



Re: [RFC PATCH v2 00/11] bfq: introduce bfq.ioprio for cgroup

2021-03-21 Thread brookxu



Paolo Valente wrote on 2021/3/21 19:04:
> 
> 
>> On 12 Mar 2021, at 12:08, brookxu wrote:
>>
>> From: Chunguang Xu 
>>
> 
> Hi Chunguang,
> 
>> Tasks in the production environment can be roughly divided into
>> three categories: emergency tasks, ordinary tasks and offline
>> tasks. Emergency tasks need to be scheduled in real time, such
>> as system agents. Offline tasks do not need to guarantee QoS,
>> but can improve system resource utilization during system idle
>> periods, such as background tasks. The above requirements need
>> to achieve IO preemption. At present, we can use weights to
>> simulate IO preemption, but since weights are more of a shared
>> concept, they cannot be simulated well. For example, the weights
>> of emergency tasks and ordinary tasks cannot be determined well,
>> offline tasks (with the same weight) actually occupy different
>> resources on disks with different performance, and the tail
>> latency caused by offline tasks cannot be well controlled. Using
>> ioprio's concept of preemption, we can solve the above problems
>> very well. Since ioprio will eventually be converted to weight,
>> using ioprio alone can also achieve weight isolation within the
>> same class. But we can still use bfq.weight to control resources,
>> achieving better IO QoS control.
>>
>> However, currently the class of bfq_group is always the BE class, and
>> the ioprio class of the task can only be reflected in a single
>> cgroup. We cannot guarantee that real-time tasks in a cgroup are
>> scheduled in time. Therefore, we introduce bfq.ioprio, which
>> allows us to configure ioprio class for cgroup. In this way, we
>> can ensure that the real-time tasks of a cgroup can be scheduled
>> in time. Similarly, the processing of offline task groups can
>> also be simpler.
>>
> 
> I find this contribution very interesting.  Anyway, given the
> relevance of such a contribution, I'd like to hear from relevant
> people (Jens, Tejun, ...?), before revising individual patches.
> 
> Yet I already have a general question.  How does this mechanism comply
> with per-process ioprios and ioprio classes?  For example, what
> happens if a process belongs to a BE-class group according to your
> mechanism, but to an RT class according to its ioprio?  Does the
> per-group class dominate the per-process class?  Is it all clean and
> predictable?
Hi Paolo, thanks for your precious time. This is a good question.
Currently the per-group class dominates the per-process class. But
thinking about it in more depth, there seems to be a problem in the
container scenario, because the tasks inside a container may have
different ioprio classes and ioprio values. Maybe bfq.ioprio should only
affect the scheduling of the group, which would be more compatible with
actual production environments.

>> The bfq.ioprio interface now is available for cgroup v1 and cgroup
>> v2. Users can configure the ioprio for cgroup through this interface,
>> as shown below:
>>
>> echo "1 2"> blkio.bfq.ioprio
> 
> Wouldn't it be nicer to have acronyms for classes (RT, BE, IDLE),
> instead of numbers?

Since ioprio is a number, the ioprio class is also given as a number.
But your suggestion is good. If necessary, I will modify it later.

> 
> Thank you very much for this improvement proposal,

More discussion is welcome. Thanks.

> Paolo
> 
>>
>> The above two values respectively represent the values of ioprio
>> class and ioprio for cgroup. The ioprio of tasks within the cgroup
>> is uniformly equal to the ioprio of the cgroup. If the ioprio of
>> the cgroup is disabled, the ioprio of the task remains the same,
>> usually from io_context.
>>
>> When testing, using fio and fio_generate_plots we can clearly see
>> that the IO delay of the task satisfies RT> BE> IDLE. When RT is
>> running, BE and IDLE are guaranteed minimum bandwidth. When used
>> with bfq.weight, we can also isolate the resource within the same
>> class.
>>
>> The test process is as follows:
>> # prepare data disk
>> mount /dev/sdb /data1
>>
>> # create cgroup v1 hierarchy
>> cd /sys/fs/cgroup/blkio
>> mkdir rt be idle
>> echo "1 0" > rt/blkio.bfq.ioprio
>> echo "2 0" > be/blkio.bfq.ioprio
>> echo "3 0" > idle/blkio.bfq.ioprio
>>
>> # run fio test
>> fio fio.ini
>>
>> # generate svg graph
>> fio_generate_plots res
>>
>> The contents of fio.ini are as follows:
>> [global]
>> ioengine=libaio
>> group_reporting=1
>> log_avg_msec=500
>> direct=1
>> time_based=1
>> iodepth=16
>> size=100M
>> rw=write
>> bs=1M
>> [rt]
>> name=rt
>> write_bw_log=rt
>> write_lat_log=rt
>> write_iops_log=rt
>> filename=/data1/rt.bin
>> cgroup=rt
>> runtime=30s
>> nice=-10
>> [be]
>> name=be
>> new_group
>> write_bw_log=be
>> write_lat_log=be
>> write_iops_log=be
>> filename=/data1/be.bin
>> cgroup=be
>> runtime=60s
>> [idle]
>> name=idle
>> new_group
>> write_bw_log=idle
>> write_lat_log=idle
>> write_iops_log=idle
>> filename=/data1/idle.bin
>> cgroup=idle
>> runtime=90s
>>
>> V2:
>> 1. Optimise bfq_select_next_class().
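
For illustration only, this is one way the "class level" pair written to
blkio.bfq.ioprio could be parsed and validated. It is a userspace sketch,
not the code from the series: sk_parse_ioprio() and sk_class_name() are
hypothetical, the class numbers 1/2/3 follow the rt/be/idle example above,
and the 0 to 7 level range is an assumption based on the usual Linux
ioprio levels. The name helper hints at how Paolo's RT/BE/IDLE acronyms
could map onto the numbers.

```c
#include <assert.h>
#include <stdio.h>

enum sk_ioprio_class { SK_CLASS_RT = 1, SK_CLASS_BE = 2, SK_CLASS_IDLE = 3 };

#define SK_IOPRIO_LEVELS 8 /* assumed: levels 0..7 */

/* Parse "class level". Returns 0 on success, -1 on malformed input. */
int sk_parse_ioprio(const char *s, int *cls, int *level)
{
	int c, l;
	char trailing;

	assert(s && cls && level);

	/* Exactly two integers; anything after them is rejected. */
	if (sscanf(s, "%d %d %c", &c, &l, &trailing) != 2)
		return -1;
	if (c < SK_CLASS_RT || c > SK_CLASS_IDLE)
		return -1;
	if (l < 0 || l >= SK_IOPRIO_LEVELS)
		return -1;

	*cls = c;
	*level = l;
	return 0;
}

/* Acronym for a class number, or "?" for an unknown class. */
const char *sk_class_name(int cls)
{
	switch (cls) {
	case SK_CLASS_RT:   return "RT";
	case SK_CLASS_BE:   return "BE";
	case SK_CLASS_IDLE: return "IDLE";
	default:            return "?";
	}
}
```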

Re: [selftests] e48d82b67a: BUG_TestSlub_RZ_alloc(Not_tainted):Redzone_overwritten

2021-03-21 Thread Oliver Sang
Hi Vlastimil,

On Wed, Mar 17, 2021 at 12:29:40PM +0100, Vlastimil Babka wrote:
> On 3/17/21 9:36 AM, kernel test robot wrote:
> > 
> > 
> > Greeting,
> > 
> > FYI, we noticed the following commit (built with gcc-9):
> > 
> > commit: e48d82b67a2b760eedf7b95ca15f41267496386c ("[PATCH 1/2] selftests: add a kselftest for SLUB debugging functionality")
> > url: https://github.com/0day-ci/linux/commits/glittao-gmail-com/selftests-add-a-kselftest-for-SLUB-debugging-functionality/20210316-204257
> > base: https://git.kernel.org/cgit/linux/kernel/git/shuah/linux-kselftest.git next
> > 
> > in testcase: trinity
> > version: trinity-static-i386-x86_64-f93256fb_2019-08-28
> > with following parameters:
> > 
> > group: group-04
> > 
> > test-description: Trinity is a linux system call fuzz tester.
> > test-url: http://codemonkey.org.uk/projects/trinity/
> > 
> > 
> > on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
> > 
> > caused below changes (please refer to attached dmesg/kmsg for entire 
> > log/backtrace):
> > 
> > 
> > +---+---+---+
> > | | v5.12-rc2 | e48d82b67a |
> > +---+---+---+
> > | BUG_TestSlub_RZ_alloc(Not_tainted):Redzone_overwritten | 0 | 69 |
> > | INFO:0x(ptrval)-0x(ptrval)@offset=#.First_byte#instead_of | 0 | 69 |
> > | INFO:Allocated_in_resiliency_test_age=#cpu=#pid= | 0 | 69 |
> > | INFO:Slab0x(ptrval)objects=#used=#fp=0x(ptrval)flags= | 0 | 69 |
> > | INFO:Object0x(ptrval)@offset=#fp=0x(ptrval) | 0 | 69 |
> > | BUG_TestSlub_next_ptr_free(Tainted:G_B):Freechain_corrupt | 0 | 69 |
> > | INFO:Freed_in_resiliency_test_age=#cpu=#pid= | 0 | 69 |
> > | BUG_TestSlub_next_ptr_free(Tainted:G_B):Wrong_object_count.Counter_is#but_counted_were | 0 | 69 |
> > | BUG_TestSlub_next_ptr_free(Tainted:G_B):Redzone_overwritten | 0 | 69 |
> > | BUG_TestSlub_next_ptr_free(Tainted:G_B):Objects_remaining_in_TestSlub_next_ptr_free_on__kmem_cache_shutdown() | 0 | 69 |
> > | INFO:Object0x(ptrval)@offset= | 0 | 69 |
> > | BUG_TestSlub_1th_word_free(Tainted:G_B):Poison_overwritten | 0 | 69 |
> > | BUG_TestSlub_50th_word_free(Tainted:G_B):Poison_overwritten | 0 | 69 |
> > | BUG_TestSlub_RZ_free(Tainted:G_B):Redzone_overwritten | 0 | 69 |
> > +---+---+---+
> > 
> > 
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot 
> > 
> > 
> > 
> > [   22.154049] random: get_random_u32 called from __kmem_cache_create+0x23/0x3e0 with crng_init=0
> > [   22.154070] random: get_random_u32 called from cache_random_seq_create+0x7c/0x140 with crng_init=0
> > [   22.154167] random: get_random_u32 called from allocate_slab+0x155/0x5e0 with crng_init=0
> > [   22.154690] test_slub: 1. kmem_cache: Clobber Redzone 0x12->0x(ptrval)
> > [   22.164499] =
> > [   22.166629] BUG TestSlub_RZ_alloc (Not tainted): Redzone overwritten
> > [   22.168179] -
> > [   22.168179]
> > [   22.168372] Disabling lock debugging due to kernel taint
> > [   22.168372] INFO: 0x(ptrval)-0x(ptrval) @offset=1064. First byte 0x12 instead of 0xcc
> > [   22.168372] INFO: Allocated in resiliency_test+0x47/0x1be age=3 cpu=0 pid=1
> > [   22.168372] __slab_alloc+0x57/0x80
> > [   22.168372] kmem_cache_alloc (kbuild/src/consumer/mm/slub.c:2871 kbuild/src/consumer/mm/slub.c:2915 kbuild/src/consumer/mm/slub.c:2920)
> > [   

[PATCH] drm/imx: ipuv3-plane: Remove two unnecessary export symbols

2021-03-21 Thread Liu Ying
The ipu_plane_disable_deferred() and ipu_planes_assign_pre() functions have
been used only internally by imxdrm, and by no other module, since imxdrm
and imx-ipuv3-crtc were merged into one module. So, this patch removes the
export symbols for the two functions.

Fixes: 3d1df96ad468 (drm/imx: merge imx-drm-core and ipuv3-crtc in one module)
Signed-off-by: Liu Ying 
---
 drivers/gpu/drm/imx/ipuv3-plane.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
index 0755080..4bd39bb 100644
--- a/drivers/gpu/drm/imx/ipuv3-plane.c
+++ b/drivers/gpu/drm/imx/ipuv3-plane.c
@@ -264,7 +264,6 @@ void ipu_plane_disable_deferred(struct drm_plane *plane)
ipu_plane_disable(ipu_plane, false);
}
 }
-EXPORT_SYMBOL_GPL(ipu_plane_disable_deferred);
 
 static void ipu_plane_state_reset(struct drm_plane *plane)
 {
@@ -813,7 +812,6 @@ int ipu_planes_assign_pre(struct drm_device *dev,
 
return 0;
 }
-EXPORT_SYMBOL_GPL(ipu_planes_assign_pre);
 
 struct ipu_plane *ipu_plane_init(struct drm_device *dev, struct ipu_soc *ipu,
 int dma, int dp, unsigned int possible_crtcs,
-- 
2.7.4



[PATCH v9] i2c: virtio: add a virtio i2c frontend driver

2021-03-21 Thread Jie Deng
Add an I2C bus driver for virtio para-virtualization.

The controller can be emulated by the backend driver in
any device model software by following the virtio protocol.

The device specification can be found on
https://lists.oasis-open.org/archives/virtio-comment/202101/msg8.html.

By following the specification, people may implement different
backend drivers to emulate different controllers according to
their needs.

Co-developed-by: Conghui Chen 
Signed-off-by: Conghui Chen 
Signed-off-by: Jie Deng 
---
Changes in v9:
- Remove the virtio_adapter and update its members in probe.
- Refined the virtio_i2c_complete_reqs for buf free.

Changes in v8:
- Make virtio_i2c.adap a pointer.
- Mark members in virtio_i2c_req with cacheline_aligned.
 
Changes in v7:
- Remove unused headers.
- Update Makefile and Kconfig.
- Add the cleanup after completing reqs.
- Avoid memcpy for data marked with I2C_M_DMA_SAFE.
- Fix something reported by kernel test robot.

Changes in v6:
- Move struct virtio_i2c_req into the driver.
- Use only one buf in struct virtio_i2c_req.

Changes in v5:
- The first version based on the acked specification.

 drivers/i2c/busses/Kconfig  |  11 ++
 drivers/i2c/busses/Makefile |   3 +
 drivers/i2c/busses/i2c-virtio.c | 286 
 include/uapi/linux/virtio_i2c.h |  40 ++
 include/uapi/linux/virtio_ids.h |   1 +
 5 files changed, 341 insertions(+)
 create mode 100644 drivers/i2c/busses/i2c-virtio.c
 create mode 100644 include/uapi/linux/virtio_i2c.h

diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
index 05ebf75..cb8d0d8 100644
--- a/drivers/i2c/busses/Kconfig
+++ b/drivers/i2c/busses/Kconfig
@@ -21,6 +21,17 @@ config I2C_ALI1535
  This driver can also be built as a module.  If so, the module
  will be called i2c-ali1535.
 
+config I2C_VIRTIO
+   tristate "Virtio I2C Adapter"
+   select VIRTIO
+   help
+ If you say yes to this option, support will be included for the virtio
+ I2C adapter driver. The hardware can be emulated by any device model
+ software according to the virtio protocol.
+
+ This driver can also be built as a module. If so, the module
+ will be called i2c-virtio.
+
 config I2C_ALI1563
tristate "ALI 1563"
depends on PCI
diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
index 615f35e..efdd3f3 100644
--- a/drivers/i2c/busses/Makefile
+++ b/drivers/i2c/busses/Makefile
@@ -145,4 +145,7 @@ obj-$(CONFIG_I2C_XGENE_SLIMPRO) += i2c-xgene-slimpro.o
 obj-$(CONFIG_SCx200_ACB)   += scx200_acb.o
 obj-$(CONFIG_I2C_FSI)  += i2c-fsi.o
 
+# VIRTIO I2C host controller driver
+obj-$(CONFIG_I2C_VIRTIO)   += i2c-virtio.o
+
 ccflags-$(CONFIG_I2C_DEBUG_BUS) := -DDEBUG
diff --git a/drivers/i2c/busses/i2c-virtio.c b/drivers/i2c/busses/i2c-virtio.c
new file mode 100644
index 0000000..316986e
--- /dev/null
+++ b/drivers/i2c/busses/i2c-virtio.c
@@ -0,0 +1,286 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Virtio I2C Bus Driver
+ *
+ * The Virtio I2C Specification:
+ * https://raw.githubusercontent.com/oasis-tcs/virtio-spec/master/virtio-i2c.tex
+ *
+ * Copyright (c) 2021 Intel Corporation. All rights reserved.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/**
+ * struct virtio_i2c - virtio I2C data
+ * @vdev: virtio device for this controller
+ * @completion: completion of virtio I2C message
+ * @adap: I2C adapter for this controller
+ * @lock: lock for virtqueue processing
+ * @vq: the virtio virtqueue for communication
+ */
+struct virtio_i2c {
+   struct virtio_device *vdev;
+   struct completion completion;
+   struct i2c_adapter adap;
+   struct mutex lock;
+   struct virtqueue *vq;
+};
+
+/**
+ * struct virtio_i2c_req - the virtio I2C request structure
+ * @out_hdr: the OUT header of the virtio I2C message
+ * @buf: the buffer into which data is read, or from which it's written
+ * @in_hdr: the IN header of the virtio I2C message
+ */
+struct virtio_i2c_req {
+   struct virtio_i2c_out_hdr out_hdr   ____cacheline_aligned;
+   uint8_t *buf                        ____cacheline_aligned;
+   struct virtio_i2c_in_hdr in_hdr     ____cacheline_aligned;
+};
+
+static void virtio_i2c_msg_done(struct virtqueue *vq)
+{
+   struct virtio_i2c *vi = vq->vdev->priv;
+
+   complete(&vi->completion);
+}
+
+static int virtio_i2c_send_reqs(struct virtqueue *vq,
+   struct virtio_i2c_req *reqs,
+   struct i2c_msg *msgs, int nr)
+{
+   struct scatterlist *sgs[3], out_hdr, msg_buf, in_hdr;
+   int i, outcnt, incnt, err = 0;
+
+   for (i = 0; i < nr; i++) {
+   if (!msgs[i].len)
+   break;
+
+   

[PATCH V4 09/10] x86/pks: Add PKS kernel API

2021-03-21 Thread ira . weiny
From: Fenghua Yu 

PKS allows kernel users to define domains of page mappings which have
additional protections beyond the paging protections.

Add an API to allocate, use, and free a protection key which identifies
such a domain.  Export 5 new symbols pks_key_alloc(), pks_mk_noaccess(),
pks_mk_readonly(), pks_mk_readwrite(), and pks_key_free().  Add 2 new
macros: PAGE_KERNEL_PKEY(key) and _PAGE_PKEY(pkey).

Update the protection key documentation to cover pkeys on supervisor
pages.

Reviewed-by: Dan Williams 
Co-developed-by: Ira Weiny 
Signed-off-by: Ira Weiny 
Signed-off-by: Fenghua Yu 

---
Changes from V3:
From Dan Williams
Remove flags from pks_key_alloc()
Convert to ARCH_ENABLE_SUPERVISOR_PKEYS
remove export of update_pkey_val()
Update documentation
change __clear_bit to clear_bit_unlock
remove cpu_feature_enabled from pks_key_free
remove pr_err stubs when CONFIG_HAS_SUPERVISOR_PKEYS=n
clarify pks_key_alloc flags parameter with enum
Update documentation for ARCH_ENABLE_SUPERVISOR_PKEYS
No need to export write_pkrs
Correct Kernel Doc for API functions
From Randy Dunlap:
Fix grammatical errors in doc

Changes from V2
From Greg KH
Replace all WARN_ON_ONCE() uses with pr_err()
From Dan Williams
Add __must_check to pks_key_alloc() to help ensure users
are using the API correctly

Changes from V1
Per Dave Hansen
Add flags to pks_key_alloc() to help future proof the
interface if/when the key space is exhausted.

Changes from RFC V3
Per Dave Hansen
Put WARN_ON_ONCE in pks_key_free()
s/pks_mknoaccess/pks_mk_noaccess/
s/pks_mkread/pks_mk_readonly/
s/pks_mkrdwr/pks_mk_readwrite/
Change return pks_key_alloc() to EOPNOTSUPP when not
supported or configured
Per Peter Zijlstra
Remove unneeded preempt disable/enable
---
 Documentation/core-api/protection-keys.rst | 108 +---
 arch/x86/include/asm/pgtable_types.h   |  12 ++
 arch/x86/include/asm/pks.h |   4 +
 arch/x86/mm/pkeys.c| 137 -
 include/linux/pgtable.h|   4 +
 include/linux/pkeys.h  |  17 +++
 6 files changed, 263 insertions(+), 19 deletions(-)

diff --git a/Documentation/core-api/protection-keys.rst b/Documentation/core-api/protection-keys.rst
index ec575e72d0b2..6d6c4f25080c 100644
--- a/Documentation/core-api/protection-keys.rst
+++ b/Documentation/core-api/protection-keys.rst
@@ -4,25 +4,30 @@
 Memory Protection Keys
 ======================
 
-Memory Protection Keys for Userspace (PKU aka PKEYs) is a feature
-which is found on Intel's Skylake (and later) "Scalable Processor"
-Server CPUs. It will be available in future non-server Intel parts
-and future AMD processors.
+Memory Protection Keys provide a mechanism for enforcing page-based
+protections, but without requiring modification of the page tables
+when an application changes protection domains.
 
-For anyone wishing to test or use this feature, it is available in
-Amazon's EC2 C5 instances and is known to work there using an Ubuntu
-17.04 image.
+PKeys Userspace (PKU) is a feature which is found on Intel's Skylake "Scalable
+Processor" Server CPUs and later.  And it will be available in future
+non-server Intel parts and future AMD processors.
 
-Memory Protection Keys provides a mechanism for enforcing page-based
-protections, but without requiring modification of the page tables
-when an application changes protection domains.  It works by
-dedicating 4 previously ignored bits in each page table entry to a
-"protection key", giving 16 possible keys.
+Protection Keys for Supervisor pages (PKS) is available in the SDM since May
+2020.
+
+pkeys work by dedicating 4 previously Reserved bits in each page table entry to
+a "protection key", giving 16 possible keys.  User and Supervisor pages are
+treated separately.
 
-There is also a new user-accessible register (PKRU) with two separate
-bits (Access Disable and Write Disable) for each key.  Being a CPU
-register, PKRU is inherently thread-local, potentially giving each
-thread a different set of protections from every other thread.
+Protections for each page are controlled with per-CPU registers for each type
+of page User and Supervisor.  Each of these 32-bit register stores two separate
+bits (Access Disable and Write Disable) for each key.
+
+For Userspace the register is user-accessible (rdpkru/wrpkru).  For
+Supervisor, the register (MSR_IA32_PKRS) is accessible only to the kernel.
+
+Being a CPU register, pkeys are inherently thread-local, potentially giving
+each thread an independent set of protections from 

[PATCH V4 07/10] x86/pks: Preserve the PKRS MSR on context switch

2021-03-21 Thread ira . weiny
From: Ira Weiny 

The PKRS MSR is defined as a per-logical-processor register.  This
isolates memory access by logical CPU.  Unfortunately, the MSR is not
managed by XSAVE.  Therefore, tasks must save/restore the MSR value on
context switch.

Define a saved PKRS value in the task struct, as well as a cached
per-logical-processor MSR value which mirrors the MSR value of the
current CPU.  Initialize all tasks with the default MSR value.  Then, on
schedule in, call write_pkrs() which automatically avoids the overhead
of the MSR write if possible.
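
The caching and schedule-in behaviour described above can be modeled in
userspace.  The following is only a sketch under stated assumptions, not
the code from this patch: fake_msr, msr_writes, model_write_pkrs(),
model_sched_in() and struct model_task are hypothetical stand-ins, and a
single CPU is modeled.  The default value is computed the same way
INIT_PKRS_VALUE is built: key 0 unrestricted, keys 1..15 access-disabled.

```c
#include <assert.h>
#include <stdint.h>

#define PKR_AD_BIT        0x1u
#define PKR_BITS_PER_PKEY 2
#define PKR_AD_KEY(pkey)  (PKR_AD_BIT << ((pkey) * PKR_BITS_PER_PKEY))

/* Default value: key 0 has no restriction, keys 1..15 have AD=1. */
static uint32_t init_pkrs_value(void)
{
	uint32_t val = 0;

	for (int pkey = 1; pkey < 16; pkey++)
		val |= PKR_AD_KEY(pkey);
	return val;
}

/* Hypothetical stand-ins for the MSR and its per-CPU cache (one CPU). */
static uint32_t fake_msr;
static uint32_t pkrs_cache;
static unsigned int msr_writes;

/* Skip the expensive "MSR write" when the cached value already matches. */
static void model_write_pkrs(uint32_t new_pkrs)
{
	if (pkrs_cache == new_pkrs)
		return;
	pkrs_cache = new_pkrs;
	fake_msr = new_pkrs;
	msr_writes++;
}

/* Hypothetical task: only the saved PKRS value is modeled. */
struct model_task {
	uint32_t saved_pkrs;
};

/* On schedule in, install the incoming task's saved value. */
static void model_sched_in(const struct model_task *next)
{
	model_write_pkrs(next->saved_pkrs);
}
```

Switching between two tasks that hold the same saved value costs no
additional write in this model, which is the saving the cache is meant
to provide.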

Reviewed-by: Dan Williams 
Co-developed-by: Fenghua Yu 
Signed-off-by: Fenghua Yu 
Signed-off-by: Ira Weiny 

---
Changes from V3
From Dan Williams
make pks_init_task() and pks_sched_in() macros
To avoid Supervisor PKey '#ifdefery' in process.c and
process_64.c
Use ARCH_ENABLE_SUPERVISOR_PKEYS
Split write_pkrs() to an earlier patch to be used in setup_pks()
Move Peter's authorship to that patch.
Remove kernel doc comment from write_pkrs
From Thomas Gleixner
Fix where pks_sched_in() is called from.
Should be called from __switch_to()
NOTE: PKS requires x86_64 so there is no need to
update process_32.c
Make pkrs_cache static
Remove unnecessary pkrs_cache declaration
Clean up formatting

Changes from V2
Adjust for PKS enable being final patch.

Changes from V1
Rebase to latest tip/master
Resolve conflicts with INIT_THREAD changes

Changes since RFC V3
Per Dave Hansen
Update commit message
move saved_pkrs to be in a nicer place
Per Peter Zijlstra
Add Comment from Peter
Clean up white space
Update authorship
---
 arch/x86/include/asm/msr-index.h|  1 +
 arch/x86/include/asm/pkeys_common.h | 14 ++
 arch/x86/include/asm/processor.h| 43 -
 arch/x86/kernel/process.c   |  3 ++
 arch/x86/kernel/process_64.c|  2 ++
 5 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 546d6ecf0a35..c15a049bf6ac 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -765,6 +765,7 @@
 
 #define MSR_IA32_TSC_DEADLINE  0x06E0
 
+#define MSR_IA32_PKRS  0x06E1
 
 #define MSR_TSX_FORCE_ABORT 0x010F
 
diff --git a/arch/x86/include/asm/pkeys_common.h b/arch/x86/include/asm/pkeys_common.h
index 0681522974ba..6917f1a27479 100644
--- a/arch/x86/include/asm/pkeys_common.h
+++ b/arch/x86/include/asm/pkeys_common.h
@@ -17,4 +17,18 @@
 #define PKR_AD_KEY(pkey) (PKR_AD_BIT << PKR_PKEY_SHIFT(pkey))
 #define PKR_WD_KEY(pkey) (PKR_WD_BIT << PKR_PKEY_SHIFT(pkey))
 
+/*
+ * Define a default PKRS value for each task.
+ *
+ * Key 0 has no restriction.  All other keys are set to the most restrictive
+ * value which is access disabled (AD=1).
+ *
+ * NOTE: This needs to be a macro to be used as part of the INIT_THREAD macro.
+ */
+#define INIT_PKRS_VALUE (PKR_AD_KEY(1) | PKR_AD_KEY(2) | PKR_AD_KEY(3) | \
+PKR_AD_KEY(4) | PKR_AD_KEY(5) | PKR_AD_KEY(6) | \
+PKR_AD_KEY(7) | PKR_AD_KEY(8) | PKR_AD_KEY(9) | \
+PKR_AD_KEY(10) | PKR_AD_KEY(11) | PKR_AD_KEY(12) | \
+PKR_AD_KEY(13) | PKR_AD_KEY(14) | PKR_AD_KEY(15))
+
 #endif /*_ASM_X86_PKEYS_COMMON_H */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index dc6d149bf851..b7ae396285dd 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -18,6 +18,7 @@ struct vm86;
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -519,6 +520,12 @@ struct thread_struct {
unsigned long   cr2;
unsigned long   trap_nr;
unsigned long   error_code;
+
+#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
+   /* Saved Protection key register for supervisor mappings */
+   u32 saved_pkrs;
+#endif
+
 #ifdef CONFIG_VM86
/* Virtual 86 mode info */
struct vm86 *vm86;
@@ -784,7 +791,41 @@ static inline void spin_lock_prefetch(const void *x)
 #define KSTK_ESP(task) (task_pt_regs(task)->sp)
 
 #else
-#define INIT_THREAD { }
+
+#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
+#define INIT_THREAD_PKRS   .saved_pkrs = INIT_PKRS_VALUE
+
+void write_pkrs(u32 new_pkrs);
+
+/*
+ * Define pks_init_task and pks_sched_in as macros to avoid requiring the
+ * definition of struct task_struct in this header while keeping the supervisor
+ * pkey #ifdefery out of process.c and process_64.c
+ */
+
+/*
+ * New tasks get the most restrictive 

[PATCH V4 10/10] x86/pks: Add PKS test code

2021-03-21 Thread ira . weiny
From: Ira Weiny 

The core PKS functionality provides an interface for kernel users to
reserve keys for their domains, set up the page tables with those keys,
and control access to those domains when needed.

Define test code which exercises the core functionality of PKS via a
debugfs entry.  Basic checks can be triggered on boot with a kernel
command line option while both basic and preemption checks can be
triggered with separate debugfs values.

debugfs controls are:

'0' -- Run access tests with a single pkey
'1' -- Set up the pkey register with no access for the pkey allocated to
   this fd
'2' -- Check that the pkey register updated in '1' is still the same.
   (To be used after a forced context switch.)
'3' -- Allocate all pkeys possible and run tests on each pkey allocated.
   DEFAULT when run at boot.

Closing the fd cleans up and releases the pkey; therefore, to exercise
context switch testing, a user space program is provided in:

.../tools/testing/selftests/x86/test_pks.c

Reviewed-by: Dan Williams 
Reviewed-by: Dave Hansen 
Co-developed-by: Fenghua Yu 
Signed-off-by: Fenghua Yu 
Signed-off-by: Ira Weiny 

---
Changes from V3
Add test into ARCH_ENABLE_SUPERVISOR_PKEYS
Fix allocate context error handling
Callback must now take pt_regs instead of irq_state
Use pipes to ensure code switches contexts
Add more verbose output
Add --debug opt to trigger more kernel debug output
Reduce kernel output by default
Use #defines for the various options
Add ability to choose cpu for testing
Work out how to make pkrs_cache global when CONFIG_PKS_TEST=y
Comments from Dan Williams:
Remove walk_table in favor of follow_pte
Adjust for new MASK and SHIFT macros
Remove unneeded pkey.h header
Handle_pks_testing -> handle_pks_test
Retain static pkrs_cache when not test
s/PKS_TESTING/PKS_TEST/
Put pks_test_callback declaration in pks_common.h
Don't export pks_test_callback
Add comment explaining context creation
Remove module boilerplate

Changes for V2
Fix compilation errors

Changes for V1
Update for new pks_key_alloc()

Changes from RFC V3
Comments from Dave Hansen
clean up whitespace damage
Clean up Kconfig help
Clean up user test error output
s/pks_mknoaccess/pks_mk_noaccess/
s/pks_mkread/pks_mk_readonly/
s/pks_mkrdwr/pks_mk_readwrite/
Comments from Jing Han
Remove duplicate stdio.h
---
 Documentation/core-api/protection-keys.rst |   5 +-
 arch/x86/include/asm/pks.h |  19 +
 arch/x86/mm/fault.c|  15 +
 arch/x86/mm/pkeys.c|   2 +-
 lib/Kconfig.debug  |  11 +
 lib/Makefile   |   3 +
 lib/pks/Makefile   |   3 +
 lib/pks/pks_test.c | 693 +
 mm/Kconfig |   3 +-
 tools/testing/selftests/x86/Makefile   |   3 +-
 tools/testing/selftests/x86/test_pks.c | 150 +
 11 files changed, 903 insertions(+), 4 deletions(-)
 create mode 100644 lib/pks/Makefile
 create mode 100644 lib/pks/pks_test.c
 create mode 100644 tools/testing/selftests/x86/test_pks.c

diff --git a/Documentation/core-api/protection-keys.rst b/Documentation/core-api/protection-keys.rst
index 6d6c4f25080c..2bcbb991231b 100644
--- a/Documentation/core-api/protection-keys.rst
+++ b/Documentation/core-api/protection-keys.rst
@@ -120,7 +120,8 @@ PTE adds this additional protection to the page.
 
 Kernel users intending to use PKS support should check (depend on)
 ARCH_HAS_SUPERVISOR_PKEYS and add their config to ARCH_ENABLE_SUPERVISOR_PKEYS
-to turn on this support within the core.
+to turn on this support within the core.  See the test configuration option
+'PKS_TEST' for an example.
 
 int pks_key_alloc(const char * const pkey_user);
 #define PAGE_KERNEL_PKEY(pkey)
@@ -170,3 +171,5 @@ text:
affected by PKRU register will not execute (even transiently)
until all prior executions of WRPKRU have completed execution
and updated the PKRU register.
+
+Example code can be found in lib/pks/pks_test.c
diff --git a/arch/x86/include/asm/pks.h b/arch/x86/include/asm/pks.h
index 4891c9aa8fc7..9e71322b0cf2 100644
--- a/arch/x86/include/asm/pks.h
+++ b/arch/x86/include/asm/pks.h
@@ -32,4 +32,23 @@ static inline void show_extended_regs_oops(struct pt_regs *regs,
 
 #endif /* CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */
 
+
+#ifdef CONFIG_PKS_TEST
+
+#define __static_or_pks_test
+
+bool handle_pks_test(unsigned long hw_error_code, struct pt_regs *regs);
+bool pks_test_callback(struct pt_regs *regs);
+
+#else /* 

[PATCH V4 08/10] x86/entry: Preserve PKRS MSR across exceptions

2021-03-21 Thread ira . weiny
From: Ira Weiny 

The PKRS MSR is not managed by XSAVE.  It is preserved through a context
switch but this support leaves exception handling code open to memory
accesses during exceptions.

Two possible places for preserving this state were considered:
irqentry_state_t or pt_regs.[1]  pt_regs was much more complicated and
was potentially fraught with unintended consequences.[2]  However, Andy
came up with a way to hide additional values on the stack which could be
accessed as "extended_pt_regs".[3]  With this method, any place which has
struct pt_regs can get access to the extra information, no extra
information is added to irq_state, and pt_regs is left intact for
compatibility with outside tools like BPF.

To simplify, the assembly code only adds space on the stack.  The
setting or use of any needed values is left to the C code.  While some
entry points may not use this space, it is still added wherever pt_regs
is passed to the C code for consistency.

Each nested exception gets another copy of this extended space allowing
for any number of levels of exception handling.
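
The nesting behaviour can be illustrated with a small userspace model.
This is a sketch under assumptions, not the code from this patch:
struct model_ext_regs, model_save_set() and model_restore() are
hypothetical names, cur_pkrs stands in for the live PKRS value, and it
is assumed that a handler starts from the default value (keys 1..15
access-disabled) after the interrupted value has been saved.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_INIT_PKRS 0x55555554u /* key 0 open, keys 1..15 AD=1 */

/* Hypothetical stand-in for the extra stack space kept with pt_regs. */
struct model_ext_regs {
	uint32_t thread_pkrs;
};

/* Models the live PKRS value of the current CPU. */
static uint32_t cur_pkrs = MODEL_INIT_PKRS;

/* On entry: stash the interrupted value, then run with the default. */
static void model_save_set(struct model_ext_regs *ext)
{
	ext->thread_pkrs = cur_pkrs;
	cur_pkrs = MODEL_INIT_PKRS;
}

/* On exit: put back whatever the interrupted context was using. */
static void model_restore(const struct model_ext_regs *ext)
{
	cur_pkrs = ext->thread_pkrs;
}
```

Because each level owns its own struct model_ext_regs, a handler that is
itself interrupted gets its value back on return, independent of how
deep the nesting goes.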

In the assembly, a macro is defined to allow a central place to add
space for other uses should the need arise.

Finally, export pkrs_{save_set|restore}_irq to the common code to allow
it to preserve the current task's PKRS in the new extended pt_regs if
enabled.

Peter, Thomas, Andy, Dave, and Dan all suggested parts of the patch or
aided in the development of the patch.

[1] https://lore.kernel.org/lkml/calcetrve1i5jdyzd_bcctxqjn+ze3t38efpgjxn1f577m36...@mail.gmail.com/
[2] https://lore.kernel.org/lkml/874kpxx4jf@nanos.tec.linutronix.de/#t
[3] https://lore.kernel.org/lkml/CALCETrUHwZPic89oExMMe-WyDY8-O3W68NcZvse3=PGW+iW5=w...@mail.gmail.com/

Acked-by: Dave Hansen 
Reviewed-by: Dan Williams 
Suggested-by: Dave Hansen 
Suggested-by: Dan Williams 
Suggested-by: Peter Zijlstra 
Suggested-by: Thomas Gleixner 
Suggested-by: Andy Lutomirski 
Signed-off-by: Ira Weiny 

---
Changes from V3:
Fix 0-day issues
Move all extended regs stuff to pks.h
From Dan Williams
Move show_extended_regs_oops ifdefery to pks.h
Remove a bad comment
s/irq_save_set_pkrs/pkrs_save_set_irq
s/irq_restore_pkrs/pkrs_restore_irq
s/ARCH_HAS/ARCH_ENABLE_SUPERVISOR_PKEYS
From Dave Hansen:
remove extra macro parameter for most calls
clarify with comments
Add BUILD check for extend regs size
use subq/addq vs push/pop
Guidance on where to find each of the pt_regs being
passed to C code
From Dan Williams and Dave Hansen:
Use a macro call to wrap the c function calls with
push/pop extended_pt_regs
From Thomas Gleixner:
Remove unnecessary noinstr's
From Andy Lutomirski:
Convert to using the extended pt_regs
Add in showing pks on fault through the extended pt_regs

Changes from V1
remove redundant irq_state->pkrs
This value is only needed for the global tracking.  So
it should be included in that patch and not in this one.

Changes from RFC V3
Standardize on 'irq_state' variable name
Per Dave Hansen
irq_save_pkrs() -> irq_save_set_pkrs()
Rebased based on clean up patch by Thomas Gleixner
This includes moving irq_[save_set|restore]_pkrs() to
the core as well.
---
 arch/x86/entry/calling.h   | 26 
 arch/x86/entry/common.c| 58 ++
 arch/x86/entry/entry_64.S  | 22 +-
 arch/x86/entry/entry_64_compat.S   |  6 +--
 arch/x86/include/asm/pks.h | 16 +++
 arch/x86/include/asm/processor-flags.h |  2 +
 arch/x86/kernel/head_64.S  |  7 ++--
 arch/x86/mm/fault.c|  3 ++
 include/linux/pkeys.h  | 17 
 kernel/entry/common.c  | 14 ++-
 10 files changed, 152 insertions(+), 19 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 07a9331d55e7..ec85f8f675be 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -97,6 +97,32 @@ For 32-bit we have the following conventions - kernel is built with
 
 #define SIZEOF_PTREGS  21*8
 
+/*
+ * __call_ext_ptregs - Helper macro to call into C with extended pt_regs
+ * @cfunc: C function to be called
+ *
+ * This will ensure that extended_ptregs is added and removed as needed during
+ * a call into C code.
+ */
+.macro __call_ext_ptregs cfunc annotate_retpoline_safe:req
+#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
+   /* add space for extended_pt_regs */
+   subq $EXTENDED_PT_REGS_SIZE, %rsp
+#endif
+   .if \annotate_retpoline_safe == 1
+   ANNOTATE_RETPOLINE_SAFE
+   

[PATCH V4 05/10] x86/pks: Add PKS setup code

2021-03-21 Thread ira . weiny
From: Ira Weiny 

Protection Keys for Supervisor pages (PKS) enables fast, hardware thread
specific, manipulation of permission restrictions on supervisor page
mappings.  It uses the same mechanism of Protection Keys as those on
User mappings but applies that mechanism to supervisor mappings using a
supervisor specific MSR.

Add setup code and the lowest level of PKS MSR write support.  The write
value is cached per-cpu to avoid the overhead of the MSR write if the
value has not changed.

That said, it should be noted that the underlying WRMSR(MSR_IA32_PKRS)
is not serializing but still maintains ordering properties similar to
WRPKRU.  The current SDM section on PKRS needs updating but should be
the same as that of WRPKRU.  So to quote from the WRPKRU text:

WRPKRU will never execute transiently. Memory accesses affected
by PKRU register will not execute (even transiently) until all
prior executions of WRPKRU have completed execution and updated
the PKRU register.

write_pkrs() contributed by Peter Zijlstra.

Introduce asm/pks.h to declare setup_pks() as an internal function call.
Later patches will also need this new header as a place to declare
internal structures and functions.

Reviewed-by: Dan Williams 
Co-developed-by: Peter Zijlstra 
Signed-off-by: Peter Zijlstra 
Co-developed-by: Fenghua Yu 
Signed-off-by: Fenghua Yu 
Signed-off-by: Ira Weiny 

---
Changes from V3:
From Dan Williams:
Update commit message
Add pks.h to hold ifdefery out of *.c files
s/ARCH_HAS.../SUPERVISOR_PKEYS
move setup_pks to pkeys.c (remove more ifdefery)
Remove 'domain' language from commit message
Clarify comment in fault handler
Move the removal of the WARN_ON_ONCE in the fault path to this
patch.  Previously it was in:
[07/10] x86/fault: Report the PKRS state on fault

Changes from V2
From Thomas: Make this patch last so PKS is not enabled until
all the PKS mechanisms are in place.  Specifically:
1) Modify setup_pks() to call write_pkrs() to properly
   set up the initial value when enabled.

2) Split this patch into two. 1) a precursor patch with
   the required defines/config options and 2) this patch
   which actually enables feature on CPUs which support
   it.

Changes since RFC V3
Per Dave Hansen
Update comment
Add X86_FEATURE_PKS to disabled-features.h
Rebase based on latest TIP tree
---
 arch/x86/include/asm/pks.h   | 15 +++
 arch/x86/kernel/cpu/common.c |  2 ++
 arch/x86/mm/pkeys.c  | 48 
 3 files changed, 65 insertions(+)
 create mode 100644 arch/x86/include/asm/pks.h

diff --git a/arch/x86/include/asm/pks.h b/arch/x86/include/asm/pks.h
new file mode 100644
index 000000000000..5d7067ada8fb
--- /dev/null
+++ b/arch/x86/include/asm/pks.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_PKS_H
+#define _ASM_X86_PKS_H
+
+#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
+
+void setup_pks(void);
+
+#else /* !CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */
+
+static inline void setup_pks(void) { }
+
+#endif /* CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS */
+
+#endif /* _ASM_X86_PKS_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ab640abe26b6..de49d0c0f4e0 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -58,6 +58,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "cpu.h"
 
@@ -1594,6 +1595,7 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 
x86_init_rdrand(c);
setup_pku(c);
+   setup_pks();
 
/*
 * Clear/Set all flags overridden by options, need do it
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index fc8c7e2bb21b..f6a3a54b8d7d 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -229,3 +229,51 @@ u32 update_pkey_val(u32 pk_reg, int pkey, unsigned int flags)
 
return pk_reg;
 }
+
+#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
+
+static DEFINE_PER_CPU(u32, pkrs_cache);
+
+/*
+ * write_pkrs() optimizes MSR writes by maintaining a per cpu cache which can
+ * be checked quickly.
+ *
+ * It should also be noted that the underlying WRMSR(MSR_IA32_PKRS) is not
+ * serializing but still maintains ordering properties similar to WRPKRU.
+ * The current SDM section on PKRS needs updating but should be the same as
+ * that of WRPKRU.  So to quote from the WRPKRU text:
+ *
+ * WRPKRU will never execute transiently. Memory accesses
+ * affected by PKRU register will not execute (even transiently)
+ * until all prior executions of WRPKRU have completed execution
+ * and updated the PKRU register.
+ */
+void write_pkrs(u32 new_pkrs)
+{
+   u32 *pkrs;
+
+   if (!static_cpu_has(X86_FEATURE_PKS))
+

[PATCH V4 02/10] x86/fpu: Refactor arch_set_user_pkey_access() for PKS support

2021-03-21 Thread ira . weiny
From: Ira Weiny 

Define a helper, update_pkey_val(), which will be used to support both
Protection Key User (PKU) and the new Protection Key for Supervisor
(PKS) in subsequent patches.

Reviewed-by: Dan Williams 
Co-developed-by: Peter Zijlstra 
Signed-off-by: Peter Zijlstra 
Signed-off-by: Ira Weiny 

---
Changes from RFC V3:
Per Dave Hansen
Update and add comments per Dave's review
Per Peter
Correct attribution
---
 arch/x86/include/asm/pkeys.h |  2 ++
 arch/x86/kernel/fpu/xstate.c | 22 --
 arch/x86/mm/pkeys.c  | 23 +++
 3 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index f9feba80894b..4526245b03e5 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -136,4 +136,6 @@ static inline int vma_pkey(struct vm_area_struct *vma)
return (vma->vm_flags & vma_pkey_mask) >> VM_PKEY_SHIFT;
 }
 
+u32 update_pkey_val(u32 pk_reg, int pkey, unsigned int flags);
+
 #endif /*_ASM_X86_PKEYS_H */
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index face29dab0e3..00251bdf759b 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -994,9 +994,7 @@ const void *get_xsave_field_ptr(int xfeature_nr)
 int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
unsigned long init_val)
 {
-   u32 old_pkru;
-   int pkey_shift = (pkey * PKR_BITS_PER_PKEY);
-   u32 new_pkru_bits = 0;
+   u32 pkru;
 
/*
 * This check implies XSAVE support.  OSPKE only gets
@@ -1012,21 +1010,9 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 */
WARN_ON_ONCE(pkey >= arch_max_pkey());
 
-   /* Set the bits we need in PKRU:  */
-   if (init_val & PKEY_DISABLE_ACCESS)
-   new_pkru_bits |= PKR_AD_BIT;
-   if (init_val & PKEY_DISABLE_WRITE)
-   new_pkru_bits |= PKR_WD_BIT;
-
-   /* Shift the bits in to the correct place in PKRU for pkey: */
-   new_pkru_bits <<= pkey_shift;
-
-   /* Get old PKRU and mask off any old bits in place: */
-   old_pkru = read_pkru();
-   old_pkru &= ~((PKR_AD_BIT|PKR_WD_BIT) << pkey_shift);
-
-   /* Write old part along with new part: */
-   write_pkru(old_pkru | new_pkru_bits);
+   pkru = read_pkru();
+   pkru = update_pkey_val(pkru, pkey, init_val);
+   write_pkru(pkru);
 
return 0;
 }
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index f5efb4007e74..d1dfe743e79f 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -208,3 +208,26 @@ static __init int setup_init_pkru(char *opt)
return 1;
 }
 __setup("init_pkru=", setup_init_pkru);
+
+/*
+ * Replace disable bits for @pkey with values from @flags
+ *
+ * Kernel users use the same flags as user space:
+ * PKEY_DISABLE_ACCESS
+ * PKEY_DISABLE_WRITE
+ */
+u32 update_pkey_val(u32 pk_reg, int pkey, unsigned int flags)
+{
+   int pkey_shift = pkey * PKR_BITS_PER_PKEY;
+
+   /*  Mask out old bit values */
+   pk_reg &= ~(((1 << PKR_BITS_PER_PKEY) - 1) << pkey_shift);
+
+   /*  Or in new values */
+   if (flags & PKEY_DISABLE_ACCESS)
+   pk_reg |= PKR_AD_BIT << pkey_shift;
+   if (flags & PKEY_DISABLE_WRITE)
+   pk_reg |= PKR_WD_BIT << pkey_shift;
+
+   return pk_reg;
+}
-- 
2.28.0.rc0.12.gb6a658bd00c9
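
For anyone who wants to experiment with the helper outside the kernel, a
minimal userspace sketch follows.  It copies the logic of
update_pkey_val() from the patch above; the constant values are
assumptions matching the existing PKU layout (AD in bit 0, WD in bit 1
of each two-bit field) and the userspace PKEY_DISABLE_* flag values.
The assert() on the key range is an addition for illustration and is
not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Two bits per key: Access Disable in bit 0, Write Disable in bit 1. */
#define PKR_AD_BIT          0x1u
#define PKR_WD_BIT          0x2u
#define PKR_BITS_PER_PKEY   2

/* Same flag values user space passes to the pkey system calls. */
#define PKEY_DISABLE_ACCESS 0x1u
#define PKEY_DISABLE_WRITE  0x2u

/* Replace the disable bits for pkey with the values from flags. */
static uint32_t update_pkey_val(uint32_t pk_reg, int pkey, unsigned int flags)
{
	int pkey_shift;

	/* Illustration only: 16 keys fit in a 32-bit register. */
	assert(pkey >= 0 && pkey < 16);
	pkey_shift = pkey * PKR_BITS_PER_PKEY;

	/* Mask out old bit values */
	pk_reg &= ~(((1u << PKR_BITS_PER_PKEY) - 1) << pkey_shift);

	/* Or in new values */
	if (flags & PKEY_DISABLE_ACCESS)
		pk_reg |= PKR_AD_BIT << pkey_shift;
	if (flags & PKEY_DISABLE_WRITE)
		pk_reg |= PKR_WD_BIT << pkey_shift;

	return pk_reg;
}
```

Passing flags of 0 clears both bits for the key, which is why the old
bits are masked out before the new ones are ORed in.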



[PATCH V4 03/10] x86/pks: Add additional PKEY helper macros

2021-03-21 Thread ira . weiny
From: Ira Weiny 

Avoid open coding shift and mask operations by defining and using helper
macros for PKey operations.

Reviewed-by: Dan Williams 
Signed-off-by: Ira Weiny 

---
Changes from V3:
new patch suggested by Dan Williams to use macros better.
---
 arch/x86/include/asm/pgtable.h  |  7 ++-
 arch/x86/include/asm/pkeys_common.h | 11 ---
 arch/x86/mm/pkeys.c |  8 +++-
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index bfbfb951fe65..b1529b44a996 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1370,19 +1370,16 @@ extern u32 init_pkru_value;
 
 static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
 {
-   int pkru_pkey_bits = pkey * PKR_BITS_PER_PKEY;
-
-   return !(pkru & (PKR_AD_BIT << pkru_pkey_bits));
+   return !(pkru & PKR_AD_KEY(pkey));
 }
 
 static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
 {
-   int pkru_pkey_bits = pkey * PKR_BITS_PER_PKEY;
/*
 * Access-disable disables writes too so we need to check
 * both bits here.
 */
-   return !(pkru & ((PKR_AD_BIT|PKR_WD_BIT) << pkru_pkey_bits));
+   return !(pkru & (PKR_AD_KEY(pkey) | PKR_WD_KEY(pkey)));
 }
 
 static inline u16 pte_flags_pkey(unsigned long pte_flags)
diff --git a/arch/x86/include/asm/pkeys_common.h b/arch/x86/include/asm/pkeys_common.h
index e40b0ced733f..0681522974ba 100644
--- a/arch/x86/include/asm/pkeys_common.h
+++ b/arch/x86/include/asm/pkeys_common.h
@@ -6,10 +6,15 @@
 #define PKR_WD_BIT 0x2
 #define PKR_BITS_PER_PKEY 2
 
+#define PKR_PKEY_SHIFT(pkey) (pkey * PKR_BITS_PER_PKEY)
+#define PKR_PKEY_MASK(pkey)  (((1 << PKR_BITS_PER_PKEY) - 1) << PKR_PKEY_SHIFT(pkey))
+
 /*
- * Generate an Access-Disable mask for the given pkey.  Several of these can be
- * OR'd together to generate pkey register values.
+ * Generate an Access-Disable and Write-Disable mask for the given pkey.
+ * Several of the AD's are OR'd together to generate a default pkey register
+ * value.
  */
-#define PKR_AD_KEY(pkey)   (PKR_AD_BIT << ((pkey) * PKR_BITS_PER_PKEY))
+#define PKR_AD_KEY(pkey) (PKR_AD_BIT << PKR_PKEY_SHIFT(pkey))
+#define PKR_WD_KEY(pkey) (PKR_WD_BIT << PKR_PKEY_SHIFT(pkey))
 
 #endif /*_ASM_X86_PKEYS_COMMON_H */
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index d1dfe743e79f..fc8c7e2bb21b 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -218,16 +218,14 @@ __setup("init_pkru=", setup_init_pkru);
  */
 u32 update_pkey_val(u32 pk_reg, int pkey, unsigned int flags)
 {
-   int pkey_shift = pkey * PKR_BITS_PER_PKEY;
-
/*  Mask out old bit values */
-   pk_reg &= ~(((1 << PKR_BITS_PER_PKEY) - 1) << pkey_shift);
+   pk_reg &= ~PKR_PKEY_MASK(pkey);
 
/*  Or in new values */
if (flags & PKEY_DISABLE_ACCESS)
-   pk_reg |= PKR_AD_BIT << pkey_shift;
+   pk_reg |= PKR_AD_KEY(pkey);
if (flags & PKEY_DISABLE_WRITE)
-   pk_reg |= PKR_WD_BIT << pkey_shift;
+   pk_reg |= PKR_WD_KEY(pkey);
 
return pk_reg;
 }
-- 
2.28.0.rc0.12.gb6a658bd00c9
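
The helper macros can be tried out in userspace with the sketch below.
It is an illustration under assumptions rather than the header from this
patch: PKR_AD_BIT is assumed to be 0x1, the constants are made unsigned,
the macro argument is parenthesized in PKR_PKEY_SHIFT() so that
expressions such as "k + 1" expand safely, and pkr_allows_read() and
pkr_allows_write() are hypothetical names mirroring the
__pkru_allows_*() helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PKR_AD_BIT 0x1u
#define PKR_WD_BIT 0x2u
#define PKR_BITS_PER_PKEY 2

/* Bit offset of a key's two-bit field, and the mask covering that field. */
#define PKR_PKEY_SHIFT(pkey) ((pkey) * PKR_BITS_PER_PKEY)
#define PKR_PKEY_MASK(pkey)  (((1u << PKR_BITS_PER_PKEY) - 1) << PKR_PKEY_SHIFT(pkey))

/* Access-Disable and Write-Disable masks for a given key. */
#define PKR_AD_KEY(pkey) (PKR_AD_BIT << PKR_PKEY_SHIFT(pkey))
#define PKR_WD_KEY(pkey) (PKR_WD_BIT << PKR_PKEY_SHIFT(pkey))

/* Reads are allowed unless Access-Disable is set for the key. */
static bool pkr_allows_read(uint32_t pkr, uint16_t pkey)
{
	return !(pkr & PKR_AD_KEY(pkey));
}

/* Access-disable disables writes too, so both bits are checked. */
static bool pkr_allows_write(uint32_t pkr, uint16_t pkey)
{
	return !(pkr & (PKR_AD_KEY(pkey) | PKR_WD_KEY(pkey)));
}
```

With these definitions, a register value of PKR_WD_KEY(2) leaves key 2
readable but not writable, while PKR_AD_KEY(2) blocks both.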



[PATCH V4 06/10] x86/fault: Adjust WARN_ON for PKey fault

2021-03-21 Thread ira . weiny
From: Ira Weiny 

PKey faults may now happen on kernel mappings if the feature is enabled.
Remove the warning in the fault path if PKS is enabled.

Reviewed-by: Dan Williams 
Signed-off-by: Ira Weiny 
---
 arch/x86/mm/fault.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a73347e2cdfc..731ec90ed413 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1141,11 +1141,12 @@ do_kern_addr_fault(struct pt_regs *regs, unsigned long hw_error_code,
   unsigned long address)
 {
/*
-* Protection keys exceptions only happen on user pages.  We
-* have no user pages in the kernel portion of the address
-* space, so do not expect them here.
+* PF_PK is expected on kernel addresses when supervisor pkeys are
+* enabled.
 */
-   WARN_ON_ONCE(hw_error_code & X86_PF_PK);
+   if (!cpu_feature_enabled(X86_FEATURE_PKS))
+   WARN_ON_ONCE(hw_error_code & X86_PF_PK);
+
 
 #ifdef CONFIG_X86_32
/*
-- 
2.28.0.rc0.12.gb6a658bd00c9



[PATCH V4 04/10] x86/pks: Add PKS defines and Kconfig options

2021-03-21 Thread ira . weiny
From: Ira Weiny 

Protection Keys for Supervisor pages (PKS) enables fast, hardware thread
specific, manipulation of permission restrictions on supervisor page
mappings.  It uses the same mechanism of Protection Keys as those on
User mappings but applies that mechanism to supervisor mappings using a
supervisor specific MSR.

Kernel users can define domains of page mappings which have an extra
level of protection beyond those specified in the supervisor page table
entries.

Define the PKS CPU feature bits.

Add the Kconfig ARCH_HAS_SUPERVISOR_PKEYS to indicate to consumers that
an architecture supports pkeys.

Introduce ARCH_ENABLE_SUPERVISOR_PKEYS to allow kernel users to specify
to the arch that they wish to use the supervisor key support if
ARCH_HAS_SUPERVISOR_PKEYS is available.  ARCH_ENABLE_SUPERVISOR_PKEYS
remains off until the first use case sets it.

Reviewed-by: Dan Williams 
Co-developed-by: Fenghua Yu 
Signed-off-by: Fenghua Yu 
Signed-off-by: Ira Weiny 

---
Changes from V3:
From Dan
Clean up commit message
Add ARCH_ENABLE_SUPERVISOR_PKEYS option so we don't have
the overhead of PKS unless there is a user
Clean up commit message grammar

Changes from V2
New patch for V3:  Split this off from the enable patch to be
able to create cleaner bisectability
---
 arch/x86/Kconfig| 1 +
 arch/x86/include/asm/cpufeatures.h  | 1 +
 arch/x86/include/asm/disabled-features.h| 8 +++-
 arch/x86/include/uapi/asm/processor-flags.h | 2 ++
 mm/Kconfig  | 4 
 5 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2792879d398e..5e3a7c2bc342 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1870,6 +1870,7 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS
depends on X86_64 && (CPU_SUP_INTEL || CPU_SUP_AMD)
select ARCH_USES_HIGH_VMA_FLAGS
select ARCH_HAS_PKEYS
+   select ARCH_HAS_SUPERVISOR_PKEYS
help
  Memory Protection Keys provides a mechanism for enforcing
  page-based protections, but without requiring modification of the
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index cc96e26d69f7..83ed73407417 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -359,6 +359,7 @@
 #define X86_FEATURE_MOVDIR64B  (16*32+28) /* MOVDIR64B instruction */
 #define X86_FEATURE_ENQCMD (16*32+29) /* ENQCMD and ENQCMDS instructions */
 #define X86_FEATURE_SGX_LC (16*32+30) /* Software Guard Extensions Launch Control */
+#define X86_FEATURE_PKS (16*32+31) /* Protection Keys for Supervisor pages */
 
 /* AMD-defined CPU features, CPUID level 0x8007 (EBX), word 17 */
 #define X86_FEATURE_OVERFLOW_RECOV (17*32+ 0) /* MCA overflow recovery support */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index b7dd944dc867..fd09ae852c04 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -44,6 +44,12 @@
 # define DISABLE_OSPKE (1<<(X86_FEATURE_OSPKE & 31))
 #endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
 
+#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
+# define DISABLE_PKS   0
+#else
+# define DISABLE_PKS   (1<<(X86_FEATURE_PKS & 31))
+#endif
+
 #ifdef CONFIG_X86_5LEVEL
 # define DISABLE_LA57  0
 #else
@@ -88,7 +94,7 @@
 #define DISABLED_MASK14 0
 #define DISABLED_MASK15 0
 #define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
-DISABLE_ENQCMD)
+DISABLE_ENQCMD|DISABLE_PKS)
 #define DISABLED_MASK17 0
 #define DISABLED_MASK18 0
 #define DISABLED_MASK19 0
diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index bcba3c643e63..191c574b2390 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -130,6 +130,8 @@
 #define X86_CR4_SMAP   _BITUL(X86_CR4_SMAP_BIT)
 #define X86_CR4_PKE_BIT 22 /* enable Protection Keys support */
 #define X86_CR4_PKE _BITUL(X86_CR4_PKE_BIT)
+#define X86_CR4_PKS_BIT 24 /* enable Protection Keys for Supervisor */
+#define X86_CR4_PKS _BITUL(X86_CR4_PKS_BIT)
 
 /*
  * x86-64 Task Priority Register, CR8
diff --git a/mm/Kconfig b/mm/Kconfig
index 24c045b24b95..c7d1fc780358 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -808,6 +808,10 @@ config ARCH_USES_HIGH_VMA_FLAGS
 	bool
 config ARCH_HAS_PKEYS
 	bool
+config ARCH_HAS_SUPERVISOR_PKEYS
+	bool
+config ARCH_ENABLE_SUPERVISOR_PKEYS
+	bool
 
 config PERCPU_STATS
 	bool "Collect percpu memory statistics"
-- 
2.28.0.rc0.12.gb6a658bd00c9
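
For readers following the bit arithmetic in the patch above, here is a
minimal user-space sketch of how a feature number such as (16*32+31) splits
into a capability word and a bit within that word, and how the CR4 bit number
becomes a mask.  The helper names (feature_word, feature_mask, cr4_pks_mask)
are made up for illustration and are not kernel APIs; only the two constants
are taken from the patch.

```c
#include <stdint.h>

/* Values copied from the patch above. */
#define X86_FEATURE_PKS		(16 * 32 + 31)
#define X86_CR4_PKS_BIT		24

/* Hypothetical helper: index of the 32-bit capability word. */
static inline unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Hypothetical helper: mask of the feature inside its word. */
static inline uint32_t feature_mask(unsigned int feature)
{
	/* An unsigned constant avoids shifting into the sign bit for bit 31. */
	return UINT32_C(1) << (feature & 31);
}

/* Hypothetical helper: the CR4.PKS enable mask. */
static inline uint64_t cr4_pks_mask(void)
{
	return UINT64_C(1) << X86_CR4_PKS_BIT;
}
```

With these definitions the PKS feature lands in word 16, which is why the
patch adds DISABLE_PKS to DISABLED_MASK16 and to no other mask.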



[PATCH V4 01/10] x86/pkeys: Create pkeys_common.h

2021-03-21 Thread ira . weiny
From: Ira Weiny 

Protection Keys User (PKU) and Protection Keys Supervisor (PKS) work in
similar fashions and can share common defines.  Specifically PKS and PKU
each have:

1. A single control register
2. The same number of keys
3. The same number of bits in the register per key
4. Access and Write disable in the same bit locations

Given the above, share all the macros that synthesize and manipulate
register values between the two features.  Unlike PKU the PKS
definitions are needed in both pgtable.h and pkeys.h.  Create a common
header for those 2 headers to share. The alternative, including
pgtable.h in pkeys.h, triggers complex header dependencies.

Share these defines by moving them into a new header, change their names
to reflect the common use, and include the header where needed.
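
As an illustration of what the shared defines compute, the following
stand-alone sketch reuses the PKR_* names from this patch in user space.  The
pkr_allows_read()/pkr_allows_write() helpers are hypothetical names that
mirror __pkru_allows_read()/__pkru_allows_write(); the unsigned suffixes on
the constants are an addition of this sketch so that shifts for key 15 stay
well defined.

```c
#include <stdbool.h>
#include <stdint.h>

#define PKR_AD_BIT		0x1u
#define PKR_WD_BIT		0x2u
#define PKR_BITS_PER_PKEY	2

/* Access-Disable mask for one pkey; several can be OR'd together. */
#define PKR_AD_KEY(pkey)	(PKR_AD_BIT << ((pkey) * PKR_BITS_PER_PKEY))

/* Hypothetical helper: reads are allowed unless Access-Disable is set. */
static inline bool pkr_allows_read(uint32_t pkr, uint16_t pkey)
{
	int shift = pkey * PKR_BITS_PER_PKEY;

	return !(pkr & (PKR_AD_BIT << shift));
}

/* Hypothetical helper: Access-Disable also disables writes. */
static inline bool pkr_allows_write(uint32_t pkr, uint16_t pkey)
{
	int shift = pkey * PKR_BITS_PER_PKEY;

	return !(pkr & ((PKR_AD_BIT | PKR_WD_BIT) << shift));
}
```

For example, PKR_AD_KEY(1) | PKR_AD_KEY(3) yields 0x44, a register value that
blocks all access through keys 1 and 3 and leaves the other keys untouched.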

Reviewed-by: Dan Williams 
Signed-off-by: Ira Weiny 

---
NOTE: The initialization of init_pkru_value causes checkpatch errors
because of the space after the '(' in the macros.  We leave this as is
because it is more readable in this format, and it is existing code.

---
Changes from V3:
From Dan Williams
Fix guard macro names
Reword commit message.

Changes from RFC V3
Per Dave Hansen
Update commit message
Add comment to PKR_AD_KEY macro
---
 arch/x86/include/asm/pgtable.h  | 13 ++---
 arch/x86/include/asm/pkeys.h|  2 ++
 arch/x86/include/asm/pkeys_common.h | 15 +++
 arch/x86/kernel/fpu/xstate.c|  8 
 arch/x86/mm/pkeys.c | 14 ++
 5 files changed, 33 insertions(+), 19 deletions(-)
 create mode 100644 arch/x86/include/asm/pkeys_common.h

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index a02c67291cfc..bfbfb951fe65 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1360,9 +1360,7 @@ static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
 }
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
-#define PKRU_AD_BIT 0x1
-#define PKRU_WD_BIT 0x2
-#define PKRU_BITS_PER_PKEY 2
+#include <asm/pkeys_common.h>
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
 extern u32 init_pkru_value;
@@ -1372,18 +1370,19 @@ extern u32 init_pkru_value;
 
 static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
 {
-   int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
-   return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits));
+   int pkru_pkey_bits = pkey * PKR_BITS_PER_PKEY;
+
+   return !(pkru & (PKR_AD_BIT << pkru_pkey_bits));
 }
 
 static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
 {
-   int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
+   int pkru_pkey_bits = pkey * PKR_BITS_PER_PKEY;
 	/*
 	 * Access-disable disables writes too so we need to check
 	 * both bits here.
 	 */
-   return !(pkru & ((PKRU_AD_BIT|PKRU_WD_BIT) << pkru_pkey_bits));
+   return !(pkru & ((PKR_AD_BIT|PKR_WD_BIT) << pkru_pkey_bits));
 }
 
 static inline u16 pte_flags_pkey(unsigned long pte_flags)
diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index 2ff9b98812b7..f9feba80894b 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -2,6 +2,8 @@
 #ifndef _ASM_X86_PKEYS_H
 #define _ASM_X86_PKEYS_H
 
+#include <asm/pkeys_common.h>
+
 #define ARCH_DEFAULT_PKEY  0
 
 /*
diff --git a/arch/x86/include/asm/pkeys_common.h 
b/arch/x86/include/asm/pkeys_common.h
new file mode 100644
index ..e40b0ced733f
--- /dev/null
+++ b/arch/x86/include/asm/pkeys_common.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_PKEYS_COMMON_H
+#define _ASM_X86_PKEYS_COMMON_H
+
+#define PKR_AD_BIT 0x1
+#define PKR_WD_BIT 0x2
+#define PKR_BITS_PER_PKEY 2
+
+/*
+ * Generate an Access-Disable mask for the given pkey.  Several of these can be
+ * OR'd together to generate pkey register values.
+ */
+#define PKR_AD_KEY(pkey)   (PKR_AD_BIT << ((pkey) * PKR_BITS_PER_PKEY))
+
+#endif /*_ASM_X86_PKEYS_COMMON_H */
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 683749b80ae2..face29dab0e3 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -995,7 +995,7 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val)
 {
 	u32 old_pkru;
-	int pkey_shift = (pkey * PKRU_BITS_PER_PKEY);
+	int pkey_shift = (pkey * PKR_BITS_PER_PKEY);
 	u32 new_pkru_bits = 0;
 
 	/*
@@ -1014,16 +1014,16 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 
 	/* Set the bits we need in PKRU:  */
 	if (init_val & PKEY_DISABLE_ACCESS)
-		new_pkru_bits |= PKRU_AD_BIT;
+		new_pkru_bits |= PKR_AD_BIT;
 	if (init_val & PKEY_DISABLE_WRITE)
-		new_pkru_bits |= PKRU_WD_BIT;
+		new_pkru_bits |= PKR_WD_BIT;
 
 	/* Shift the bits in to the correct place in PKRU for pkey: 

[PATCH V4 00/10] PKS Add Protection Key Supervisor support

2021-03-21 Thread ira . weiny
From: Ira Weiny 

Introduce a new page protection mechanism for supervisor pages, Protection Key
Supervisor (PKS).

Generally PKS enables protections on 'domains' of supervisor pages to limit
supervisor mode access to pages beyond the normal paging protections.  PKS
works in a similar fashion to user space pkeys, PKU.  As with PKU, supervisor
pkeys are checked in addition to normal paging protections, and access or writes
can be disabled via an MSR update without TLB flushes when permissions change.

Also like PKU, a page mapping is assigned to a domain by setting pkey bits in
the page table entry for that mapping.

Access is controlled through a PKRS register which is updated via WRMSR/RDMSR.

XSAVE is not supported for the PKRS MSR.  Therefore the implementation
saves/restores the MSR across context switches and during exceptions.  Nested
exceptions are supported by each exception getting a new PKS state.

For consistent behavior with current paging protections, pkey 0 is reserved and
configured to allow full access via the pkey mechanism, thus preserving the
default paging protections on mappings with the default pkey value of 0.

Other keys (1-15) are allocated by an allocator, which prepares us for key
contention from day one.  Kernel users should be prepared for the allocator to
fail, either because of key exhaustion or because PKS is not supported on the
CPU instance.
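
To make the two failure modes concrete, here is a minimal user-space sketch of
a key allocator with key 0 reserved.  It is not the allocator from this
series: demo_pkey_alloc(), demo_pkey_free(), the bitmap and the error codes
chosen here are all assumptions for illustration.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_NR_PKEYS	16

/* Bit n set means key n is in use; key 0 is reserved from the start. */
static uint16_t demo_pkey_map = 0x1;

/* Stand-in for "the CPU supports PKS"; always true in this sketch. */
static int demo_pks_supported = 1;

/* Returns a key in 1..15, -EOPNOTSUPP without PKS, -ENOSPC when exhausted. */
static int demo_pkey_alloc(void)
{
	int key;

	if (!demo_pks_supported)
		return -EOPNOTSUPP;

	for (key = 1; key < DEMO_NR_PKEYS; key++) {
		if (!(demo_pkey_map & (1u << key))) {
			demo_pkey_map |= (uint16_t)(1u << key);
			return key;
		}
	}

	return -ENOSPC;
}

/* Releases a key; key 0 and out-of-range values are ignored. */
static void demo_pkey_free(int key)
{
	if (key > 0 && key < DEMO_NR_PKEYS)
		demo_pkey_map &= (uint16_t)~(1u << key);
}
```

A caller in this model checks for a negative return and falls back to running
without the extra protection rather than treating the failure as fatal.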

The following are key attributes of PKS.

   1) Fast switching of permissions
1a) Prevents access without page table manipulations
1b) No TLB flushes required
   2) Works on a per thread basis

PKS is available with 4 and 5 level paging.  Like PKRU it consumes 4 bits from
the PTE to store the pkey within the entry.
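
The 4-bit field can be pictured with the following stand-alone sketch.  It
assumes the pkey occupies PTE bits 59 through 62, which is where x86-64 places
the protection key field; the DEMO_* names and both helpers are hypothetical
and not taken from the series.

```c
#include <stdint.h>

/* Assumption: the pkey field sits in PTE bits 59..62 on x86-64. */
#define DEMO_PTE_PKEY_SHIFT	59
#define DEMO_PTE_PKEY_MASK	(UINT64_C(0xf) << DEMO_PTE_PKEY_SHIFT)

/* Extract the 4-bit pkey from a raw PTE value. */
static inline unsigned int demo_pte_pkey(uint64_t pte)
{
	return (unsigned int)((pte & DEMO_PTE_PKEY_MASK) >> DEMO_PTE_PKEY_SHIFT);
}

/* Return a copy of the PTE with its pkey field replaced. */
static inline uint64_t demo_pte_set_pkey(uint64_t pte, unsigned int pkey)
{
	pte &= ~DEMO_PTE_PKEY_MASK;
	pte |= ((uint64_t)(pkey & 0xf)) << DEMO_PTE_PKEY_SHIFT;
	return pte;
}
```

Four bits give 16 possible values, which matches the key range described
earlier: key 0 as the default plus keys 1-15.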

All code to support PKS is configured via ARCH_ENABLE_SUPERVISOR_PKEYS, which
is designed to be turned on only when a user of PKS is configured in the kernel.
Those users must depend on ARCH_HAS_SUPERVISOR_PKEYS to work properly with
other architectures which do not yet support PKS.

Originally this series was submitted as part of a large patch set which
converted the kmap call sites.[1]

Many follow-on discussions revealed a few problems.  The first was that some
callers leak a kmap mapping across threads rather than containing it to a
critical section.  Attempts were made to see if these 'global kmaps' could be
supported.[2]  However, supporting global kmaps had many problems.  Work is
being done in parallel on converting as many kmap calls as possible to the new
kmap_local_page().[3]


Changes from V3 [4]
Add ARCH_ENABLE_SUPERVISOR_PKEYS config which is selected by kernel
users to add the functionality to the core.  However, they should only
select this if ARCH_HAS_SUPERVISOR_PKEYS is available.
Clean up test code for context switching
Adjust for extended_pt_regs
Reduce output unless --debug is specified
Address internal review comments from Dan Williams and Dave Hansen
Help with macros and assembly coding
Change names of various functions
Clean up documentation
Move all #ifdefery into header files.
Clean up cover letter.
Make extended_pt_regs handling a macro rather than coding
around every call to C
Add macros for PKS shift/mask
New patch : x86/pks: Add additional PKEY helper macros
Preserve pkrs_cache as static when PKS_TEST is not configured
Remove unnecessary pr_* prints
Clarify pks_key_alloc flags parameter
Change CONFIG_PKS_TESTING to CONFIG_PKS_TEST
Clean up test code separation from main code in fault.c
Remove module boilerplate from test code
Clean up all commit messages
Address comments from Thomas Gleixner
Provide a warning and fallback to no protection if a global
mapping is requested.
Fix context switch.  Fix where pks_sched_in() is called.
Fix test to actually do a context switch
Remove unnecessary noinstr's
From Andy Lutomirski
Use extended_pt_regs idea to stash pks values on the stack
Drop patches 5/10 and 7/10
And use extended_pt_regs to print pkey info on fault
Adjust tests
Comments from Randy Dunlap:
Fix grammatical errors in doc
Clean up kernel docs
Rebase to 5.12


[1] https://lore.kernel.org/lkml/20201009195033.3208459-1-ira.we...@intel.com/

[2] https://lore.kernel.org/lkml/87mtycqcjf@nanos.tec.linutronix.de/

[3] https://lore.kernel.org/lkml/20210128061503.1496847-1-ira.we...@intel.com/
https://lore.kernel.org/lkml/20210210062221.3023586-1-ira.we...@intel.com/
https://lore.kernel.org/lkml/20210205170030.856723-1-ira.we...@intel.com/
   

Re: [PATCH] KVM: x86: A typo fix

2021-03-21 Thread Bhaskar Chowdhury

On 23:54 Sun 21 Mar 2021, Ingo Molnar wrote:



> These single file typo fixes are a bad idea for another reason as
> well, as they create a lot of unnecessary churn.


Huh! I was expecting it from the moment I started doing it ... finally it
arrives.

I am not sure about the "so called workflow of others" ... I am gonna do it my
way as long as it is doing good.

I think this is the best way to do it.

~Bhaskar





[PATCH 5/5] MAINTAINERS: add cifsd kernel server

2021-03-21 Thread Namjae Jeon
Add myself, Steve French, Sergey Senozhatsky and Hyunchul Lee
as cifsd maintainers.

Signed-off-by: Namjae Jeon 
Signed-off-by: Sergey Senozhatsky 
Signed-off-by: Hyunchul Lee 
Acked-by: Ronnie Sahlberg 
Signed-off-by: Steve French 
---
 MAINTAINERS | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index aa84121c5611..30f678f8b4d3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4434,7 +4434,7 @@ F:include/linux/clk/
 F: include/linux/of_clk.h
 X: drivers/clk/clkdev.c
 
-COMMON INTERNET FILE SYSTEM (CIFS)
+COMMON INTERNET FILE SYSTEM CLIENT (CIFS)
 M: Steve French 
 L: linux-c...@vger.kernel.org
 L: samba-techni...@lists.samba.org (moderated for non-subscribers)
@@ -,6 +,16 @@ T:   git git://git.samba.org/sfrench/cifs-2.6.git
 F: Documentation/admin-guide/cifs/
 F: fs/cifs/
 
+COMMON INTERNET FILE SYSTEM SERVER (CIFSD)
+M: Namjae Jeon 
+M: Sergey Senozhatsky 
+M: Steve French 
+M: Hyunchul Lee 
+L: linux-c...@vger.kernel.org
+L: linux-cifsd-de...@lists.sourceforge.net
+S: Maintained
+F: fs/cifsd/
+
 COMPACTPCI HOTPLUG CORE
 M: Scott Murray 
 L: linux-...@vger.kernel.org
-- 
2.17.1



[PATCH 3/5] cifsd: add file operations

2021-03-21 Thread Namjae Jeon
This adds file operations and buffer pool for cifsd.

Signed-off-by: Namjae Jeon 
Signed-off-by: Sergey Senozhatsky 
Signed-off-by: Hyunchul Lee 
Acked-by: Ronnie Sahlberg 
Signed-off-by: Steve French 
---
 fs/cifsd/buffer_pool.c |  292 ++
 fs/cifsd/buffer_pool.h |   28 +
 fs/cifsd/vfs.c | 1989 
 fs/cifsd/vfs.h |  314 +++
 fs/cifsd/vfs_cache.c   |  851 +
 fs/cifsd/vfs_cache.h   |  213 +
 6 files changed, 3687 insertions(+)
 create mode 100644 fs/cifsd/buffer_pool.c
 create mode 100644 fs/cifsd/buffer_pool.h
 create mode 100644 fs/cifsd/vfs.c
 create mode 100644 fs/cifsd/vfs.h
 create mode 100644 fs/cifsd/vfs_cache.c
 create mode 100644 fs/cifsd/vfs_cache.h

diff --git a/fs/cifsd/buffer_pool.c b/fs/cifsd/buffer_pool.c
new file mode 100644
index ..864fea547c68
--- /dev/null
+++ b/fs/cifsd/buffer_pool.c
@@ -0,0 +1,292 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ *   Copyright (C) 2018 Samsung Electronics Co., Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "glob.h"
+#include "buffer_pool.h"
+#include "connection.h"
+#include "mgmt/ksmbd_ida.h"
+
+static struct kmem_cache *filp_cache;
+
+struct wm {
+	struct list_head	list;
+	unsigned int		sz;
+	char			buffer[0];
+};
+
+struct wm_list {
+	struct list_head	list;
+	unsigned int		sz;
+
+	spinlock_t		wm_lock;
+	int			avail_wm;
+	struct list_head	idle_wm;
+	wait_queue_head_t	wm_wait;
+};
+
+static LIST_HEAD(wm_lists);
+static DEFINE_RWLOCK(wm_lists_lock);
+
+void *ksmbd_alloc(size_t size)
+{
+   return kvmalloc(size, GFP_KERNEL | __GFP_ZERO);
+}
+
+void ksmbd_free(void *ptr)
+{
+   kvfree(ptr);
+}
+
+static struct wm *wm_alloc(size_t sz, gfp_t flags)
+{
+	struct wm *wm;
+	size_t alloc_sz = sz + sizeof(struct wm);
+
+	wm = kvmalloc(alloc_sz, flags);
+	if (!wm)
+		return NULL;
+	wm->sz = sz;
+	return wm;
+}
+
+static int register_wm_size_class(size_t sz)
+{
+	struct wm_list *l, *nl;
+
+	nl = kvmalloc(sizeof(struct wm_list), GFP_KERNEL);
+	if (!nl)
+		return -ENOMEM;
+
+	nl->sz = sz;
+	spin_lock_init(&nl->wm_lock);
+	INIT_LIST_HEAD(&nl->idle_wm);
+	INIT_LIST_HEAD(&nl->list);
+	init_waitqueue_head(&nl->wm_wait);
+	nl->avail_wm = 0;
+
+	write_lock(&wm_lists_lock);
+	list_for_each_entry(l, &wm_lists, list) {
+		if (l->sz == sz) {
+			write_unlock(&wm_lists_lock);
+			kvfree(nl);
+			return 0;
+		}
+	}
+
+	list_add(&nl->list, &wm_lists);
+	write_unlock(&wm_lists_lock);
+	return 0;
+}
+
+static struct wm_list *match_wm_list(size_t size)
+{
+	struct wm_list *l, *rl = NULL;
+
+	read_lock(&wm_lists_lock);
+	list_for_each_entry(l, &wm_lists, list) {
+		if (l->sz == size) {
+			rl = l;
+			break;
+		}
+	}
+	read_unlock(&wm_lists_lock);
+	return rl;
+}
+
+static struct wm *find_wm(size_t size)
+{
+	struct wm_list *wm_list;
+	struct wm *wm;
+
+	wm_list = match_wm_list(size);
+	if (!wm_list) {
+		if (register_wm_size_class(size))
+			return NULL;
+		wm_list = match_wm_list(size);
+	}
+
+	if (!wm_list)
+		return NULL;
+
+	while (1) {
+		spin_lock(&wm_list->wm_lock);
+		if (!list_empty(&wm_list->idle_wm)) {
+			wm = list_entry(wm_list->idle_wm.next,
+					struct wm,
+					list);
+			list_del(&wm->list);
+			spin_unlock(&wm_list->wm_lock);
+			return wm;
+		}
+
+		if (wm_list->avail_wm > num_online_cpus()) {
+			spin_unlock(&wm_list->wm_lock);
+			wait_event(wm_list->wm_wait,
+				   !list_empty(&wm_list->idle_wm));
+			continue;
+		}
+
+		wm_list->avail_wm++;
+		spin_unlock(&wm_list->wm_lock);
+
+		wm = wm_alloc(size, GFP_KERNEL);
+		if (!wm) {
+			spin_lock(&wm_list->wm_lock);
+			wm_list->avail_wm--;
+			spin_unlock(&wm_list->wm_lock);
+			wait_event(wm_list->wm_wait,
+				   !list_empty(&wm_list->idle_wm));
+			continue;
+		}
+		break;
+	}
+
+	return wm;
+}
+
+static void release_wm(struct wm *wm, struct wm_list *wm_list)
+{
+	if (!wm)
+		return;
+
+	spin_lock(&wm_list->wm_lock);
+  

[PATCH 4/5] cifsd: add Kconfig and Makefile

2021-03-21 Thread Namjae Jeon
This adds the Kconfig and Makefile for cifsd.

Signed-off-by: Namjae Jeon 
Signed-off-by: Sergey Senozhatsky 
Signed-off-by: Hyunchul Lee 
Acked-by: Ronnie Sahlberg 
Signed-off-by: Steve French 
---
 fs/Kconfig|  1 +
 fs/Makefile   |  1 +
 fs/cifsd/Kconfig  | 64 +++
 fs/cifsd/Makefile | 13 ++
 4 files changed, 79 insertions(+)
 create mode 100644 fs/cifsd/Kconfig
 create mode 100644 fs/cifsd/Makefile

diff --git a/fs/Kconfig b/fs/Kconfig
index a55bda4233bb..92deb66021d1 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -340,6 +340,7 @@ config NFS_V4_2_SSC_HELPER
 source "net/sunrpc/Kconfig"
 source "fs/ceph/Kconfig"
 source "fs/cifs/Kconfig"
+source "fs/cifsd/Kconfig"
 source "fs/coda/Kconfig"
 source "fs/afs/Kconfig"
 source "fs/9p/Kconfig"
diff --git a/fs/Makefile b/fs/Makefile
index 3215fe205256..62dc87f3ff94 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -97,6 +97,7 @@ obj-$(CONFIG_NLS) += nls/
 obj-$(CONFIG_UNICODE)  += unicode/
 obj-$(CONFIG_SYSV_FS)  += sysv/
 obj-$(CONFIG_CIFS) += cifs/
+obj-$(CONFIG_SMB_SERVER)   += cifsd/
 obj-$(CONFIG_HPFS_FS)  += hpfs/
 obj-$(CONFIG_NTFS_FS)  += ntfs/
 obj-$(CONFIG_UFS_FS)   += ufs/
diff --git a/fs/cifsd/Kconfig b/fs/cifsd/Kconfig
new file mode 100644
index ..6e78960be5b1
--- /dev/null
+++ b/fs/cifsd/Kconfig
@@ -0,0 +1,64 @@
+config SMB_SERVER
+   tristate "SMB server support (EXPERIMENTAL)"
+   depends on INET
+   select NLS
+   select NLS_UTF8
+   select CRYPTO
+   select CRYPTO_MD4
+   select CRYPTO_MD5
+   select CRYPTO_HMAC
+   select CRYPTO_ARC4
+   select CRYPTO_ECB
+   select CRYPTO_LIB_DES
+   select CRYPTO_SHA256
+   select CRYPTO_CMAC
+   select CRYPTO_SHA512
+   select CRYPTO_AEAD2
+   select CRYPTO_CCM
+   select CRYPTO_GCM
+   default n
+   help
+ Choose Y here if you want to allow SMB3 compliant clients
+ to access files residing on this system using SMB3 protocol.
+ To compile the SMB3 server support as a module,
+ choose M here: the module will be called ksmbd.
+
+ You may choose to use a samba server instead, in which
+ case you can choose N here.
+
+ You also need to install user space programs which can be found
+ in cifsd-tools, available from
+ https://github.com/cifsd-team/cifsd-tools.
+ More detail about how to run the cifsd kernel server is
+ available via README file
+ (https://github.com/cifsd-team/cifsd-tools/blob/master/README).
+
+ cifsd kernel server includes support for auto-negotiation,
+ Secure negotiate, Pre-authentication integrity, oplock/lease,
+ compound requests, multi-credit, packet signing, RDMA(smbdirect),
+ smb3 encryption, copy-offload, secure per-user session
+ establishment via NTLM or NTLMv2.
+
+config SMB_SERVER_SMBDIRECT
+   bool "Support for SMB Direct protocol"
+   depends on SMB_SERVER=m && INFINIBAND && INFINIBAND_ADDR_TRANS || SMB_SERVER=y && INFINIBAND=y && INFINIBAND_ADDR_TRANS=y
+   default n
+
+   help
+ Enables SMB Direct support for SMB 3.0, 3.02 and 3.1.1.
+
+ SMB Direct allows transferring SMB packets over RDMA. If unsure,
+ say N.
+
+config SMB_SERVER_CHECK_CAP_NET_ADMIN
+   bool "Enable check network administration capability"
+   depends on SMB_SERVER
+   default y
+
+   help
+ Prevent unprivileged processes from starting the cifsd kernel server.
+
+config SMB_SERVER_KERBEROS5
+   bool "Support for Kerberos 5"
+   depends on SMB_SERVER
+   default n
diff --git a/fs/cifsd/Makefile b/fs/cifsd/Makefile
new file mode 100644
index ..a6c03c4ba51e
--- /dev/null
+++ b/fs/cifsd/Makefile
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# Makefile for Linux SMB3 kernel server
+#
+obj-$(CONFIG_SMB_SERVER) += ksmbd.o
+
+ksmbd-y := unicode.o auth.o vfs.o vfs_cache.o server.o buffer_pool.o \
+   misc.o oplock.o connection.o ksmbd_work.o crypto_ctx.o \
+   mgmt/ksmbd_ida.o mgmt/user_config.o mgmt/share_config.o \
+   mgmt/tree_connect.o mgmt/user_session.o smb_common.o \
+   transport_tcp.o transport_ipc.o smbacl.o smb2pdu.o \
+   smb2ops.o smb2misc.o asn1.o netmisc.o ndr.o
+ksmbd-$(CONFIG_SMB_SERVER_SMBDIRECT) += transport_rdma.o
-- 
2.17.1



[PATCH 0/5] cifsd: introduce new SMB3 kernel server

2021-03-21 Thread Namjae Jeon
This is the patch series for cifsd(ksmbd) kernel server.

What is cifsd(ksmbd) ?
======================

The SMB family of protocols is the most widely deployed
network filesystem protocol, the default on Windows and Macs (and even
on many phones and tablets), with clients and servers on all major
operating systems, but lacked a kernel server for Linux. For many
cases the current userspace server choices were suboptimal
either due to memory footprint, performance or difficulty integrating
well with advanced Linux features.

ksmbd is a new kernel module which implements the server side of the SMB3
protocol. The target is to provide a GPLv2 SMB server with optimized
performance and better lease handling (distributed caching). The bigger goal
is to add new
features more rapidly (e.g. RDMA aka "smbdirect", and recent encryption
and signing improvements to the protocol) which are easier to develop
on a smaller, more tightly optimized kernel server than for example
in Samba.  The Samba project is much broader in scope (tools, security services,
LDAP, Active Directory Domain Controller, and a cross platform file server
for a wider variety of purposes) but the user space file server portion
of Samba has proved hard to optimize for some Linux workloads, including
for smaller devices. This is not meant to replace Samba, but rather be
an extension to allow better optimizing for Linux, and will continue to
integrate well with Samba user space tools and libraries where appropriate.
Working with the Samba team we have already made sure that the configuration
files and xattrs are in a compatible format between the kernel and
user space server.


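Architecture
============

               |--- ...
       --------|--- ksmbd/3 - Client 3
       |-------|--- ksmbd/2 - Client 2
       |       |         ____________________________________________________
       |       |        |- Client 1                                          |
<--- Socket ---|--- ksmbd/1   <<= Authentication : NTLM/NTLM2, Kerberos      |
       |       |      | |     <<= SMB engine : SMB2, SMB2.1, SMB3, SMB3.0.2, |
       |       |      | |                SMB3.1.1                            |
       |       |      | |____________________________________________________|
       |       |      |
       |       |      |--- VFS --- Local Filesystem
       |       |
KERNEL |--- ksmbd/0(forker kthread)
---------------||---------------------------------------------------------------
USER           ||
               || communication using NETLINK
               ||  ______________________________________________
               || |                                              |
        ksmbd.mountd <<= DCE/RPC(srvsvc, wkssvc, samr, lsarpc)   |
               ^  |  <<= configure shares setting, user accounts |
               |  |______________________________________________|
               |
               |------ smb.conf(config file)
               |
               |------ ksmbdpwd.db(user account/password file)
                            ^
  ksmbd.adduser ---------------|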

The subset of performance related operations (open/read/write/close etc.)
belongs in kernel space (ksmbd), while the other subset, which is not really
performance related (DCE/RPC, user account/share database), is handled in
user space (ksmbd.mountd).

When ksmbd.mountd is started, it starts up a forker thread at initialization
time and opens a dedicated port 445 for listening to SMB requests. Whenever a
new client makes a request, the forker thread accepts the client connection and
forks a new thread for a dedicated communication channel between the client and
the server.


ksmbd feature status
====================

============================== =================================================
Feature name                   Status
============================== =================================================
Dialects                       Supported. SMB2.1 SMB3.0, SMB3.1.1 dialects
                               (intentionally excludes security vulnerable SMB1
                               dialect).
Auto Negotiation               Supported.
Compound Request               Supported.
Oplock Cache Mechanism         Supported.
SMB2 leases(v1 lease)          Supported.
Directory leases(v2 lease)     Planned for future.
Multi-credits                  Supported.
NTLM/NTLMv2                    Supported.
HMAC-SHA256 Signing            Supported.
Secure negotiate               Supported.
Signing Update                 Supported.
Pre-authentication integrity   Supported.
SMB3 encryption(CCM, GCM)      Supported. (CCM and GCM128 supported, GCM256 in
                               progress)
SMB direct(RDMA)               Partially Supported. SMB3 Multi-channel is
                               required to connect to Windows client.
SMB3 Multi-channel             In Progress.
SMB3.1.1 POSIX extension       Supported.
ACLs                           Partially Supported. only DACLs available, SACLs
                               (auditing) is planned 

[PATCH RESEND] random: remove dead code left over from blocking pool

2021-03-21 Thread Eric Biggers
From: Eric Biggers 

Remove some dead code that was left over following commit 90ea1c6436d2
("random: remove the blocking pool").

Cc: linux-cry...@vger.kernel.org
Cc: Andy Lutomirski 
Cc: Jann Horn 
Cc: Theodore Ts'o 
Reviewed-by: Andy Lutomirski 
Acked-by: Ard Biesheuvel 
Signed-off-by: Eric Biggers 
---
 drivers/char/random.c | 17 ++-
 include/trace/events/random.h | 83 ---
 2 files changed, 3 insertions(+), 97 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 5d6acfecd919b..605969ed0f965 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -500,7 +500,6 @@ struct entropy_store {
unsigned short add_ptr;
unsigned short input_rotate;
int entropy_count;
-   unsigned int initialized:1;
unsigned int last_data_init:1;
__u8 last_data[EXTRACT_SIZE];
 };
@@ -660,7 +659,7 @@ static void process_random_ready_list(void)
  */
 static void credit_entropy_bits(struct entropy_store *r, int nbits)
 {
-   int entropy_count, orig, has_initialized = 0;
+   int entropy_count, orig;
const int pool_size = r->poolinfo->poolfracbits;
int nfrac = nbits << ENTROPY_SHIFT;
 
@@ -717,23 +716,14 @@ static void credit_entropy_bits(struct entropy_store *r, int nbits)
 	if (cmpxchg(&r->entropy_count, orig, entropy_count) != orig)
 		goto retry;
 
-	if (has_initialized) {
-		r->initialized = 1;
-		kill_fasync(&fasync, SIGIO, POLL_IN);
-	}
-
 	trace_credit_entropy_bits(r->name, nbits,
 				  entropy_count >> ENTROPY_SHIFT, _RET_IP_);
 
 	if (r == &input_pool) {
 		int entropy_bits = entropy_count >> ENTROPY_SHIFT;
 
-		if (crng_init < 2) {
-			if (entropy_bits < 128)
-				return;
+		if (crng_init < 2 && entropy_bits >= 128)
 			crng_reseed(&primary_crng, r);
-			entropy_bits = ENTROPY_BITS(r);
-		}
 	}
 }
 
@@ -1372,8 +1362,7 @@ static size_t account(struct entropy_store *r, size_t nbytes, int min,
 }
 
 /*
- * This function does the actual extraction for extract_entropy and
- * extract_entropy_user.
+ * This function does the actual extraction for extract_entropy.
  *
  * Note: we assume that .poolwords is a multiple of 16 words.
  */
diff --git a/include/trace/events/random.h b/include/trace/events/random.h
index 9570a10cb949b..3d7b432ca5f31 100644
--- a/include/trace/events/random.h
+++ b/include/trace/events/random.h
@@ -85,28 +85,6 @@ TRACE_EVENT(credit_entropy_bits,
  __entry->entropy_count, (void *)__entry->IP)
 );
 
-TRACE_EVENT(push_to_pool,
-   TP_PROTO(const char *pool_name, int pool_bits, int input_bits),
-
-   TP_ARGS(pool_name, pool_bits, input_bits),
-
-   TP_STRUCT__entry(
-   __field( const char *,  pool_name   )
-   __field(  int,  pool_bits   )
-   __field(  int,  input_bits  )
-   ),
-
-   TP_fast_assign(
-   __entry->pool_name  = pool_name;
-   __entry->pool_bits  = pool_bits;
-   __entry->input_bits = input_bits;
-   ),
-
-   TP_printk("%s: pool_bits %d input_pool_bits %d",
- __entry->pool_name, __entry->pool_bits,
- __entry->input_bits)
-);
-
 TRACE_EVENT(debit_entropy,
TP_PROTO(const char *pool_name, int debit_bits),
 
@@ -161,35 +139,6 @@ TRACE_EVENT(add_disk_randomness,
  MINOR(__entry->dev), __entry->input_bits)
 );
 
-TRACE_EVENT(xfer_secondary_pool,
-   TP_PROTO(const char *pool_name, int xfer_bits, int request_bits,
-int pool_entropy, int input_entropy),
-
-   TP_ARGS(pool_name, xfer_bits, request_bits, pool_entropy,
-   input_entropy),
-
-   TP_STRUCT__entry(
-   __field( const char *,  pool_name   )
-   __field(  int,  xfer_bits   )
-   __field(  int,  request_bits)
-   __field(  int,  pool_entropy)
-   __field(  int,  input_entropy   )
-   ),
-
-   TP_fast_assign(
-   __entry->pool_name  = pool_name;
-   __entry->xfer_bits  = xfer_bits;
-   __entry->request_bits   = request_bits;
-   __entry->pool_entropy   = pool_entropy;
-   __entry->input_entropy  = input_entropy;
-   ),
-
-   TP_printk("pool %s xfer_bits %d request_bits %d pool_entropy %d "
- "input_entropy %d", __entry->pool_name, __entry->xfer_bits,
- __entry->request_bits, __entry->pool_entropy,
- __entry->input_entropy)
-);
-
 DECLARE_EVENT_CLASS(random__get_random_bytes,
TP_PROTO(int nbytes, unsigned long IP),
 
@@ -253,38 

[PATCH RESEND] random: initialize ChaCha20 constants with correct endianness

2021-03-21 Thread Eric Biggers
From: Eric Biggers 

On big endian CPUs, the ChaCha20-based CRNG is using the wrong
endianness for the ChaCha20 constants.

This doesn't matter cryptographically, but technically it means it's not
ChaCha20 anymore.  Fix it to always use the standard constants.
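
The difference can be seen in user space with the sketch below.  It compares
the words produced by copying the string "expand 32-byte k" in host byte order
with the words obtained by an explicit little-endian load.  The helper names
(load_le32, consts_via_memcpy, consts_via_le32, host_is_little_endian) are
invented for this illustration; only the string and the four constants come
from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Assemble a 32-bit word from 4 bytes, least significant byte first. */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* What the old memcpy() produced: words in host byte order. */
static void consts_via_memcpy(uint32_t state[4])
{
	memcpy(state, "expand 32-byte k", 16);
}

/* Endian-independent: the standard ChaCha20 constants on every CPU. */
static void consts_via_le32(uint32_t state[4])
{
	const unsigned char *sigma = (const unsigned char *)"expand 32-byte k";
	int i;

	for (i = 0; i < 4; i++)
		state[i] = load_le32(sigma + 4 * i);
}

/* Runtime check of the host byte order. */
static int host_is_little_endian(void)
{
	uint32_t one = 1;
	unsigned char first;

	memcpy(&first, &one, 1);
	return first == 1;
}
```

On a little-endian host both functions fill the state with the same words; on
a big-endian host only the explicit little-endian load gives 0x61707865 and
friends.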

Cc: linux-cry...@vger.kernel.org
Cc: Andy Lutomirski 
Cc: Jann Horn 
Cc: Theodore Ts'o 
Acked-by: Herbert Xu 
Acked-by: Ard Biesheuvel 
Signed-off-by: Eric Biggers 
---
 drivers/char/random.c   | 4 ++--
 include/crypto/chacha.h | 9 +++--
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0fe9e200e4c84..5d6acfecd919b 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -819,7 +819,7 @@ static bool __init crng_init_try_arch_early(struct crng_state *crng)
 
 static void __maybe_unused crng_initialize_secondary(struct crng_state *crng)
 {
-	memcpy(&crng->state[0], "expand 32-byte k", 16);
+	chacha_init_consts(crng->state);
 	_get_random_bytes(&crng->state[4], sizeof(__u32) * 12);
 	crng_init_try_arch(crng);
 	crng->init_time = jiffies - CRNG_RESEED_INTERVAL - 1;
@@ -827,7 +827,7 @@ static void __maybe_unused crng_initialize_secondary(struct crng_state *crng)
 
 static void __init crng_initialize_primary(struct crng_state *crng)
 {
-	memcpy(&crng->state[0], "expand 32-byte k", 16);
+	chacha_init_consts(crng->state);
 	_extract_entropy(&input_pool, &crng->state[4], sizeof(__u32) * 12, 0);
 	if (crng_init_try_arch_early(crng) && trust_cpu) {
 		invalidate_batched_entropy();
diff --git a/include/crypto/chacha.h b/include/crypto/chacha.h
index 3a1c72fdb7cf5..dabaee6987186 100644
--- a/include/crypto/chacha.h
+++ b/include/crypto/chacha.h
@@ -47,13 +47,18 @@ static inline void hchacha_block(const u32 *state, u32 *out, int nrounds)
 	hchacha_block_generic(state, out, nrounds);
 }
 
-void chacha_init_arch(u32 *state, const u32 *key, const u8 *iv);
-static inline void chacha_init_generic(u32 *state, const u32 *key, const u8 *iv)
+static inline void chacha_init_consts(u32 *state)
 {
 	state[0]  = 0x61707865; /* "expa" */
 	state[1]  = 0x3320646e; /* "nd 3" */
 	state[2]  = 0x79622d32; /* "2-by" */
 	state[3]  = 0x6b206574; /* "te k" */
+}
+
+void chacha_init_arch(u32 *state, const u32 *key, const u8 *iv);
+static inline void chacha_init_generic(u32 *state, const u32 *key, const u8 *iv)
+{
+	chacha_init_consts(state);
 	state[4]  = key[0];
 	state[5]  = key[1];
 	state[6]  = key[2];
-- 
2.31.0



RE: [Linuxarm] Re: [PATCH] sched/fair: remove redundant test_idle_cores for non-smt

2021-03-21 Thread Song Bao Hua (Barry Song)


> -Original Message-
> From: Li, Aubrey [mailto:aubrey...@linux.intel.com]
> Sent: Monday, March 22, 2021 5:37 PM
> To: Song Bao Hua (Barry Song) ;
> vincent.guit...@linaro.org; mi...@redhat.com; pet...@infradead.org;
> juri.le...@redhat.com; dietmar.eggem...@arm.com; rost...@goodmis.org;
> bseg...@google.com; mgor...@suse.de
> Cc: valentin.schnei...@arm.com; linux-arm-ker...@lists.infradead.org;
> linux-kernel@vger.kernel.org; xuwei (O) ; Zengtao (B)
> ; guodong...@linaro.org; yangyicong
> ; Liguozhu (Kenneth) ;
> linux...@openeuler.org
> Subject: [Linuxarm] Re: [PATCH] sched/fair: remove redundant test_idle_cores
> for non-smt
> 
> Hi Barry,
> 
> On 2021/3/21 6:14, Barry Song wrote:
> > update_idle_core() is only done for the case of sched_smt_present.
> > but test_idle_cores() is done for all machines even those without
> > smt.
> 
> The patch looks good to me.
> May I know for what case we need to keep CONFIG_SCHED_SMT for non-smt
> machines?


Hi Aubrey,

I think the arm64 defconfig has always enabled
CONFIG_SCHED_SMT:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/configs/defconfig

That is probably true for x86 as well.

I don't think Linux distributions will build a separate kernel
for machines without SMT, so basically the kernel depends on
runtime topology parsing to figure out whether SMT is present rather
than depending on a rebuild.
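
To illustrate the point, here is a small user-space sketch of a runtime gate.
A plain bool stands in for the sched_smt_present static key, and every demo_*
name is hypothetical; the real code uses static_branch_likely() and RCU, which
this sketch does not model.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the sched_smt_present static key; set once from topology. */
static bool demo_smt_present;

/* Stand-in for struct sched_domain_shared. */
struct demo_llc_shared {
	int has_idle_cores;
};

/* One kernel image, decision made at boot from the parsed topology. */
static void demo_detect_topology(int threads_per_core)
{
	demo_smt_present = threads_per_core > 1;
}

static bool demo_test_idle_cores(const struct demo_llc_shared *sds, bool def)
{
	/* Skip the shared-state load entirely on machines without SMT. */
	if (demo_smt_present && sds)
		return sds->has_idle_cores != 0;

	return def;
}
```

The same binary therefore takes the cheap path on a non-SMT machine and the
full path on an SMT machine, with no rebuild in between.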


> 
> Thanks,
> -Aubrey
> 
> 
> > this could contribute to up to 8%+ hackbench performance loss on a
> > machine like kunpeng 920 which has no smt. this patch removes the
> > redundant test_idle_cores() for non-smt machines.
> >
> > we run the below hackbench with different -g parameter from 2 to
> > 14, for each different g, we run the command 10 times and get the
> > average time:
> > $ numactl -N 0 hackbench -p -T -l 2 -g $1
> >
> > hackbench will report the time which is needed to complete a certain
> > number of messages transmissions between a certain number of tasks,
> > for example:
> > $ numactl -N 0 hackbench -p -T -l 2 -g 10
> > Running in threaded mode with 10 groups using 40 file descriptors each
> > (== 400 tasks)
> > Each sender will pass 2 messages of 100 bytes
> >
> > The below is the result of hackbench w/ and w/o this patch:
> > g=        2      4      6      8     10      12      14
> > w/o: 1.8151 3.8499 5.5142 7.2491 9.0340 10.7345 12.0929
> > w/ : 1.8428 3.7436 5.4501 6.9522 8.2882  9.9535 11.3367
> >                            +4.1%  +8.3%   +7.3%   +6.3%
> >
> > Signed-off-by: Barry Song 
> > ---
> >  kernel/sched/fair.c | 8 +---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 2e2ab1e..de42a32 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6038,9 +6038,11 @@ static inline bool test_idle_cores(int cpu, bool def)
> >  {
> > struct sched_domain_shared *sds;
> >
> > -   sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
> > -   if (sds)
> > -   return READ_ONCE(sds->has_idle_cores);
> > +   if (static_branch_likely(&sched_smt_present)) {
> > +   sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
> > +   if (sds)
> > +   return READ_ONCE(sds->has_idle_cores);
> > +   }
> >
> > return def;
> >  }

Thanks
Barry



[PATCH v8 1/3] dmaengine: ptdma: Initial driver for the AMD PTDMA

2021-03-21 Thread Sanjay R Mehta
From: Sanjay R Mehta 

Add support for the AMD PTDMA controller. It performs high-bandwidth
memory-to-memory and IO copy operations. Device commands are managed
via a circular queue of 'descriptors', each of which specifies source
and destination addresses for copying a single buffer of data.

Signed-off-by: Sanjay R Mehta 
---
 MAINTAINERS   |   6 +
 drivers/dma/Kconfig   |   2 +
 drivers/dma/Makefile  |   1 +
 drivers/dma/ptdma/Kconfig |  11 ++
 drivers/dma/ptdma/Makefile|  10 ++
 drivers/dma/ptdma/ptdma-dev.c | 290 +++
 drivers/dma/ptdma/ptdma-pci.c | 251 ++
 drivers/dma/ptdma/ptdma.h | 305 ++
 8 files changed, 876 insertions(+)
 create mode 100644 drivers/dma/ptdma/Kconfig
 create mode 100644 drivers/dma/ptdma/Makefile
 create mode 100644 drivers/dma/ptdma/ptdma-dev.c
 create mode 100644 drivers/dma/ptdma/ptdma-pci.c
 create mode 100644 drivers/dma/ptdma/ptdma.h

diff --git a/MAINTAINERS b/MAINTAINERS
index bfc1b86..9a4c04bd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -941,6 +941,12 @@ S: Supported
 T: git git://people.freedesktop.org/~agd5f/linux
 F: drivers/gpu/drm/amd/pm/powerplay/
 
++AMD PTDMA DRIVER
++M:Sanjay R Mehta 
++L:dmaeng...@vger.kernel.org
++S:Maintained
++F:drivers/dma/ptdma/
+
 AMD SEATTLE DEVICE TREE SUPPORT
 M: Brijesh Singh 
 M: Suravee Suthikulpanit 
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index d242c76..861492e 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -724,6 +724,8 @@ source "drivers/dma/bestcomm/Kconfig"
 
 source "drivers/dma/mediatek/Kconfig"
 
+source "drivers/dma/ptdma/Kconfig"
+
 source "drivers/dma/qcom/Kconfig"
 
 source "drivers/dma/dw/Kconfig"
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index 948a8da..30b8f91 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -16,6 +16,7 @@ obj-$(CONFIG_DMATEST) += dmatest.o
 obj-$(CONFIG_ALTERA_MSGDMA) += altera-msgdma.o
 obj-$(CONFIG_AMBA_PL08X) += amba-pl08x.o
 obj-$(CONFIG_AMCC_PPC440SPE_ADMA) += ppc4xx/
+obj-$(CONFIG_AMD_PTDMA) += ptdma/
 obj-$(CONFIG_AT_HDMAC) += at_hdmac.o
 obj-$(CONFIG_AT_XDMAC) += at_xdmac.o
 obj-$(CONFIG_AXI_DMAC) += dma-axi-dmac.o
diff --git a/drivers/dma/ptdma/Kconfig b/drivers/dma/ptdma/Kconfig
new file mode 100644
index 000..c4f8c6f
--- /dev/null
+++ b/drivers/dma/ptdma/Kconfig
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config AMD_PTDMA
+   tristate  "AMD PassThru DMA Engine"
+   depends on X86_64 && PCI
+   help
+ Enable support for the AMD PTDMA controller. This controller
+ provides DMA capabilities to performs high bandwidth memory to
+ memory and IO copy operation. It performs DMA transfer through
+ queue based descriptor management. This DMA controller is intended
+ to use with AMD Non-Transparent Bridge devices and not for general
+ purpose peripheral DMA.
diff --git a/drivers/dma/ptdma/Makefile b/drivers/dma/ptdma/Makefile
new file mode 100644
index 000..320fa82
--- /dev/null
+++ b/drivers/dma/ptdma/Makefile
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# AMD Passthru DMA driver
+#
+
+obj-$(CONFIG_AMD_PTDMA) += ptdma.o
+
+ptdma-objs := ptdma-dev.o
+
+ptdma-$(CONFIG_PCI) += ptdma-pci.o
diff --git a/drivers/dma/ptdma/ptdma-dev.c b/drivers/dma/ptdma/ptdma-dev.c
new file mode 100644
index 000..4617550
--- /dev/null
+++ b/drivers/dma/ptdma/ptdma-dev.c
@@ -0,0 +1,290 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD Passthru DMA device driver
+ * -- Based on the CCP driver
+ *
+ * Copyright (C) 2016,2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Sanjay R Mehta 
+ * Author: Gary R Hook 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ptdma.h"
+
+/* Human-readable error strings */
+static char *pt_error_codes[] = {
+   "",
+   "ERR 01: ILLEGAL_ENGINE",
+   "ERR 03: ILLEGAL_FUNCTION_TYPE",
+   "ERR 04: ILLEGAL_FUNCTION_MODE",
+   "ERR 06: ILLEGAL_FUNCTION_SIZE",
+   "ERR 08: ILLEGAL_FUNCTION_RSVD",
+   "ERR 09: ILLEGAL_BUFFER_LENGTH",
+   "ERR 10: VLSB_FAULT",
+   "ERR 11: ILLEGAL_MEM_ADDR",
+   "ERR 12: ILLEGAL_MEM_SEL",
+   "ERR 13: ILLEGAL_CONTEXT_ID",
+   "ERR 15: 0xF Reserved",
+   "ERR 18: CMD_TIMEOUT",
+   "ERR 19: IDMA0_AXI_SLVERR",
+   "ERR 20: IDMA0_AXI_DECERR",
+   "ERR 21: 0x15 Reserved",
+   "ERR 22: IDMA1_AXI_SLAVE_FAULT",
+   "ERR 23: IDMA1_AIXI_DECERR",
+   "ERR 24: 0x18 Reserved",
+   "ERR 27: 0x1B Reserved",
+   "ERR 38: ODMA0_AXI_SLVERR",
+   "ERR 39: ODMA0_AXI_DECERR",
+   "ERR 40: 0x28 Reserved",
+   "ERR 41: ODMA1_AXI_SLVERR",
+   "ERR 42: ODMA1_AXI_DECERR",
+   "ERR 43: LSB_PARITY_ERR",
+};
+
+static void pt_log_error(struct pt_device *d, int e)
+{
+   dev_err(d->dev, "PTDMA 

[PATCH v8 0/3] Add support for AMD PTDMA controller driver

2021-03-21 Thread Sanjay R Mehta
From: Sanjay R Mehta 

This patch series adds support for the AMD PTDMA controller, which
performs high-bandwidth memory-to-memory and IO copy operations. It
performs DMA transfers through queue-based descriptor management.

AMD processors have multiple PTDMA device instances, with each controller
having a single queue. The driver also adds support for multiple PTDMA
instances; each device gets a unique identifier and uniquely
named resources.

v8:
- split the code into different functions, one to find the active desc
  and a second to complete it and invoke the callback.
- used FIELD_PREP & FIELD_GET instead of union struct bitfields.
- modified all style fixes as per the comments.

v7:
- Fixed module warnings reported by kernel test robot.

v6:
- Removed debug artifacts and made the suggested cosmetic changes.
- implemented and used to_pt_chan and to_pt_desc inline functions.
- Removed src and dst address check as framework does this.
- Removed devm_kcalloc() usage and used devm_kzalloc() api.
- Using framework debugfs directory to store dma info.

v5:
- modified code to submit the next transaction in the ISR itself and
  removed the tasklet.
- implemented .device_synchronize API.
- converted debugfs code by using DEFINE_SHOW_ATTRIBUTE()
- using dbg_dev_root for debugfs root directory.
- removed dma_status from pt_dma_chan
- removed module parameter cmd_queue_lenght.
- removed global device list for multiple devices.
- removed code related to dynamic adding/deleting to device list
- removed pt_add_device and pt_del_device functions

v4:
- modified DMA channel and descriptor management using virt-dma layer
  instead of list based management.
- return only status of the cookie from pt_tx_status
- copyright year changed from 2019 to 2020
- removed dummy code for suspend & resume
- used bitmask and genmask

v3:
- Fixed the sparse warnings.

v2:
- Added controller description in cover letter
- Removed "default m" from Kconfig
- Replaced low_address() and high_address() functions with kernel
  API's lower_32_bits & upper_32_bits().
- Removed the BH handler function pt_core_irq_bh() and instead
  handling transaction in irq handler itself.
- Moved presetting of command queue registers into new function
  "init_cmdq_regs()"
- Removed the kernel thread dependency to submit transaction.
- Increased the hardware command queue size to 32 and adding it
  as a module parameter.
- Removed backlog command queue handling mechanism.
- Removed software command queue handling and instead submitting
  transaction command directly to
  hardware command queue.
- Added tasklet structure variable in "struct pt_device".
  This is used to invoke pt_do_cmd_complete() upon receiving interrupt
  for command completion.
- pt_core_perform_passthru() function parameters are modified and it is
  now used to submit command directly to hardware from dmaengine framew
- Removed below structures, enums, macros and functions, as these value
  constants. Making command submission simple,
   - Removed "union pt_function"  and several macros like PT_VERSION,
 PT_BYTESWAP, PT_CMD_* etc..
   - enum pt_passthru_bitwise, enum pt_passthru_byteswap, enum pt_memty
 struct pt_dma_info, struct pt_data, struct pt_mem, struct pt_passt
 struct pt_op,

Links of the review comments for v7:
1. https://lkml.org/lkml/2020/11/18/351
2. https://lkml.org/lkml/2020/11/18/384

Links of the review comments for v5:
1. https://lkml.org/lkml/2020/7/3/154
2. https://lkml.org/lkml/2020/8/25/431
3. https://lkml.org/lkml/2020/7/3/177
4. https://lkml.org/lkml/2020/7/3/186

Links of the review comments for v5:
1. https://lkml.org/lkml/2020/5/4/42
2. https://lkml.org/lkml/2020/5/4/45
3. https://lkml.org/lkml/2020/5/4/38
4. https://lkml.org/lkml/2020/5/26/70

Links of the review comments for v4:
1. https://lkml.org/lkml/2020/1/24/12
2. https://lkml.org/lkml/2020/1/24/17

Links of the review comments for v2:
1. https://lkml.org/lkml/2019/12/27/630
2. https://lkml.org/lkml/2020/1/3/23
3. https://lkml.org/lkml/2020/1/3/314
4. https://lkml.org/lkml/2020/1/10/100

Links of the review comments for v1:
1. https://lkml.org/lkml/2019/9/24/490
2. https://lkml.org/lkml/2019/9/24/399
3. https://lkml.org/lkml/2019/9/24/862
4. https://lkml.org/lkml/2019/9/24/122

Sanjay R Mehta (3):
  dmaengine: ptdma: Initial driver for the AMD PTDMA
  dmaengine: ptdma: register PTDMA controller as a DMA resource
  dmaengine: ptdma: Add debugfs entries for PTDMA

 MAINTAINERS |   6 +
 drivers/dma/Kconfig |   2 +
 drivers/dma/Makefile|   1 +
 

[PATCH v8 3/3] dmaengine: ptdma: Add debugfs entries for PTDMA

2021-03-21 Thread Sanjay R Mehta
From: Sanjay R Mehta 

Expose data about the configuration and operation of the
PTDMA through debugfs entries: device name, capabilities,
configuration, statistics.
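
For readers unfamiliar with the mask-and-shift decoding used for the "info" entry, here is a small standalone sketch. The mask and shift values are the ones defined in this patch; the helper names pt_version_sketch() and pt_num_queues_sketch() are hypothetical, and any register value fed to them in a test is made up rather than read from hardware:

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the patch below. */
#define RI_VERSION_NUM	0x0000003Fu
#define RI_NUM_VQM	0x00078000u
#define RI_NVQM_SHIFT	15

/* Low six bits of the version register hold the version number. */
static unsigned int pt_version_sketch(uint32_t regval)
{
	return regval & RI_VERSION_NUM;
}

/* Bits 15..18 hold the queue count; mask first, then shift down. */
static unsigned int pt_num_queues_sketch(uint32_t regval)
{
	unsigned int q = (regval & RI_NUM_VQM) >> RI_NVQM_SHIFT;

	assert(q <= 15u);	/* a four-bit field cannot exceed 15 */
	return q;
}
```

The cover letter mentions FIELD_GET(); in kernel code that macro would express the same mask-then-shift without a separate shift constant.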

Signed-off-by: Sanjay R Mehta 
---
 drivers/dma/ptdma/Makefile|   2 +-
 drivers/dma/ptdma/ptdma-debugfs.c | 115 ++
 drivers/dma/ptdma/ptdma-dev.c |   5 ++
 drivers/dma/ptdma/ptdma.h |   6 ++
 4 files changed, 127 insertions(+), 1 deletion(-)
 create mode 100644 drivers/dma/ptdma/ptdma-debugfs.c

diff --git a/drivers/dma/ptdma/Makefile b/drivers/dma/ptdma/Makefile
index a528cb0..ce54102 100644
--- a/drivers/dma/ptdma/Makefile
+++ b/drivers/dma/ptdma/Makefile
@@ -5,6 +5,6 @@
 
 obj-$(CONFIG_AMD_PTDMA) += ptdma.o
 
-ptdma-objs := ptdma-dev.o ptdma-dmaengine.o
+ptdma-objs := ptdma-dev.o ptdma-dmaengine.o ptdma-debugfs.o
 
 ptdma-$(CONFIG_PCI) += ptdma-pci.o
diff --git a/drivers/dma/ptdma/ptdma-debugfs.c 
b/drivers/dma/ptdma/ptdma-debugfs.c
new file mode 100644
index 000..1f69159
--- /dev/null
+++ b/drivers/dma/ptdma/ptdma-debugfs.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD Passthrough DMA device driver
+ * -- Based on the CCP driver
+ *
+ * Copyright (C) 2016,2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Sanjay R Mehta 
+ * Author: Gary R Hook 
+ */
+
+#include 
+#include 
+
+#include "ptdma.h"
+
+/* DebugFS helpers */
+#define MAX_NAME_LEN   20
+#define RI_VERSION_NUM 0x003F
+
+#define RI_NUM_VQM     0x00078000
+#define RI_NVQM_SHIFT  15
+
+static DEFINE_MUTEX(pt_debugfs_lock);
+
+static int pt_debugfs_info_show(struct seq_file *s, void *p)
+{
+   struct pt_device *pt = s->private;
+   unsigned int regval;
+
+   if (!pt)
+   return 0;
+
+   seq_printf(s, "Device name: %s\n", pt->name);
+   seq_printf(s, "   # Queues: %d\n", 1);
+   seq_printf(s, " # Cmds: %d\n", pt->cmd_count);
+
+   regval = ioread32(pt->io_regs + CMD_PT_VERSION);
+
+   seq_printf(s, "Version: %d\n", regval & RI_VERSION_NUM);
+   seq_puts(s, "Engines:");
+   seq_puts(s, "\n");
+   seq_printf(s, " Queues: %d\n", (regval & RI_NUM_VQM) >> 
RI_NVQM_SHIFT);
+
+   return 0;
+}
+
+/*
+ * Return a formatted buffer containing the current
+ * statistics of queue for PTDMA
+ */
+static int pt_debugfs_stats_show(struct seq_file *s, void *p)
+{
+   struct pt_device *pt = s->private;
+
+   seq_printf(s, "Total Interrupts Handled: %ld\n", pt->total_interrupts);
+
+   return 0;
+}
+
+static int pt_debugfs_queue_show(struct seq_file *s, void *p)
+{
+   struct pt_cmd_queue *cmd_q = s->private;
+   unsigned int regval;
+
+   if (!cmd_q)
+   return 0;
+
+   seq_printf(s, "   Pass-Thru: %ld\n", cmd_q->total_pt_ops);
+
+   regval = ioread32(cmd_q->reg_int_enable);
+
+   seq_puts(s, "  Enabled Interrupts:");
+   if (regval & INT_EMPTY_QUEUE)
+   seq_puts(s, " EMPTY");
+   if (regval & INT_QUEUE_STOPPED)
+   seq_puts(s, " STOPPED");
+   if (regval & INT_ERROR)
+   seq_puts(s, " ERROR");
+   if (regval & INT_COMPLETION)
+   seq_puts(s, " COMPLETION");
+   seq_puts(s, "\n");
+
+   return 0;
+}
+
+DEFINE_SHOW_ATTRIBUTE(pt_debugfs_info);
+DEFINE_SHOW_ATTRIBUTE(pt_debugfs_queue);
+DEFINE_SHOW_ATTRIBUTE(pt_debugfs_stats);
+
+void ptdma_debugfs_setup(struct pt_device *pt)
+{
+   struct pt_cmd_queue *cmd_q;
+   char name[MAX_NAME_LEN + 1];
+   struct dentry *debugfs_q_instance;
+
+   if (!debugfs_initialized())
+   return;
+
+   debugfs_create_file("info", 0400, pt->dma_dev.dbg_dev_root, pt,
+   &pt_debugfs_info_fops);
+
+   debugfs_create_file("stats", 0600, pt->dma_dev.dbg_dev_root, pt,
+   &pt_debugfs_stats_fops);
+
+   cmd_q = &pt->cmd_q;
+
+   snprintf(name, MAX_NAME_LEN - 1, "q");
+
+   debugfs_q_instance =
+   debugfs_create_dir(name, pt->dma_dev.dbg_dev_root);
+
+   debugfs_create_file("stats", 0600, debugfs_q_instance, cmd_q,
+   &pt_debugfs_queue_fops);
+}
diff --git a/drivers/dma/ptdma/ptdma-dev.c b/drivers/dma/ptdma/ptdma-dev.c
index 7122933..ba37b81 100644
--- a/drivers/dma/ptdma/ptdma-dev.c
+++ b/drivers/dma/ptdma/ptdma-dev.c
@@ -103,6 +103,7 @@ int pt_core_perform_passthru(struct pt_cmd_queue *cmd_q,
struct ptdma_desc desc;
 
cmd_q->cmd_error = 0;
+   cmd_q->total_pt_ops++;
memset(&desc, 0, sizeof(desc));
desc.dw0 = CMD_DESC_DW0_VAL;
desc.length = pt_engine->src_len;
@@ -151,6 +152,7 @@ static irqreturn_t pt_core_irq_handler(int irq, void *data)
u32 status;
 
pt_core_disable_queue_interrupts(pt);
+   pt->total_interrupts++;
 
status = ioread32(cmd_q->reg_interrupt_status);
if (status) {
@@ -272,6 +274,9 @@ int pt_core_init(struct pt_device *pt)
if (ret)

[PATCH v8 2/3] dmaengine: ptdma: register PTDMA controller as a DMA resource

2021-03-21 Thread Sanjay R Mehta
From: Sanjay R Mehta 

Register ptdma queue to Linux dmaengine framework as general-purpose
DMA channels.

Signed-off-by: Sanjay R Mehta 
---
 drivers/dma/ptdma/Kconfig   |   2 +
 drivers/dma/ptdma/Makefile  |   2 +-
 drivers/dma/ptdma/ptdma-dev.c   |  32 +++
 drivers/dma/ptdma/ptdma-dmaengine.c | 480 
 drivers/dma/ptdma/ptdma.h   |  31 +++
 5 files changed, 546 insertions(+), 1 deletion(-)
 create mode 100644 drivers/dma/ptdma/ptdma-dmaengine.c

diff --git a/drivers/dma/ptdma/Kconfig b/drivers/dma/ptdma/Kconfig
index c4f8c6f..7a6bfcd 100644
--- a/drivers/dma/ptdma/Kconfig
+++ b/drivers/dma/ptdma/Kconfig
@@ -2,6 +2,8 @@
 config AMD_PTDMA
tristate  "AMD PassThru DMA Engine"
depends on X86_64 && PCI
+   select DMA_ENGINE
+   select DMA_VIRTUAL_CHANNELS
help
  Enable support for the AMD PTDMA controller. This controller
  provides DMA capabilities to performs high bandwidth memory to
diff --git a/drivers/dma/ptdma/Makefile b/drivers/dma/ptdma/Makefile
index 320fa82..a528cb0 100644
--- a/drivers/dma/ptdma/Makefile
+++ b/drivers/dma/ptdma/Makefile
@@ -5,6 +5,6 @@
 
 obj-$(CONFIG_AMD_PTDMA) += ptdma.o
 
-ptdma-objs := ptdma-dev.o
+ptdma-objs := ptdma-dev.o ptdma-dmaengine.o
 
 ptdma-$(CONFIG_PCI) += ptdma-pci.o
diff --git a/drivers/dma/ptdma/ptdma-dev.c b/drivers/dma/ptdma/ptdma-dev.c
index 4617550..7122933 100644
--- a/drivers/dma/ptdma/ptdma-dev.c
+++ b/drivers/dma/ptdma/ptdma-dev.c
@@ -124,6 +124,26 @@ static inline void pt_core_enable_queue_interrupts(struct 
pt_device *pt)
iowrite32(SUPPORTED_INTERRUPTS, pt->cmd_q.reg_int_enable);
 }
 
+static void pt_do_cmd_complete(unsigned long data)
+{
+   struct pt_tasklet_data *tdata = (struct pt_tasklet_data *)data;
+   struct pt_cmd *cmd = tdata->cmd;
+   struct pt_cmd_queue *cmd_q = &cmd->pt->cmd_q;
+   u32 tail;
+
+   tail = lower_32_bits(cmd_q->qdma_tail + cmd_q->qidx * Q_DESC_SIZE);
+   if (cmd_q->cmd_error) {
+  /*
+   * Log the error and flush the queue by
+   * moving the head pointer
+   */
+   pt_log_error(cmd_q->pt, cmd_q->cmd_error);
+   iowrite32(tail, cmd_q->reg_head_lo);
+   }
+
+   cmd->pt_cmd_callback(cmd->data, cmd->ret);
+}
+
 static irqreturn_t pt_core_irq_handler(int irq, void *data)
 {
struct pt_device *pt = data;
@@ -147,6 +167,7 @@ static irqreturn_t pt_core_irq_handler(int irq, void *data)
}
 
pt_core_enable_queue_interrupts(pt);
+   pt_do_cmd_complete((ulong)&pt->tdata);
 
return IRQ_HANDLED;
 }
@@ -246,8 +267,16 @@ int pt_core_init(struct pt_device *pt)
 
pt_core_enable_queue_interrupts(pt);
 
+   /* Register the DMA engine support */
+   ret = pt_dmaengine_register(pt);
+   if (ret)
+   goto e_dmaengine;
+
return 0;
 
+e_dmaengine:
+   free_irq(pt->pt_irq, pt);
+
 e_dma_alloc:
dma_free_coherent(dev, cmd_q->qsize, cmd_q->qbase, cmd_q->qbase_dma);
 
@@ -264,6 +293,9 @@ void pt_core_destroy(struct pt_device *pt)
struct pt_cmd_queue *cmd_q = &pt->cmd_q;
struct pt_cmd *cmd;
 
+   /* Unregister the DMA engine */
+   pt_dmaengine_unregister(pt);
+
/* Disable and clear interrupts */
pt_core_disable_queue_interrupts(pt);
 
diff --git a/drivers/dma/ptdma/ptdma-dmaengine.c 
b/drivers/dma/ptdma/ptdma-dmaengine.c
new file mode 100644
index 000..9db9923
--- /dev/null
+++ b/drivers/dma/ptdma/ptdma-dmaengine.c
@@ -0,0 +1,480 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD Passthrough DMA device driver
+ * -- Based on the CCP driver
+ *
+ * Copyright (C) 2016,2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Sanjay R Mehta 
+ * Author: Gary R Hook 
+ */
+
+#include "ptdma.h"
+#include "../dmaengine.h"
+#include "../virt-dma.h"
+
+static inline struct pt_dma_chan *to_pt_chan(struct dma_chan *dma_chan)
+{
+   return container_of(dma_chan, struct pt_dma_chan, vc.chan);
+}
+
+static inline struct pt_dma_desc *to_pt_desc(struct virt_dma_desc *vd)
+{
+   return container_of(vd, struct pt_dma_desc, vd);
+}
+
+static void pt_free_cmd_resources(struct pt_device *pt,
+ struct list_head *list)
+{
+   struct pt_dma_cmd *cmd, *ctmp;
+
+   list_for_each_entry_safe(cmd, ctmp, list, entry) {
+   list_del(&cmd->entry);
+   kmem_cache_free(pt->dma_cmd_cache, cmd);
+   }
+}
+
+static void pt_free_chan_resources(struct dma_chan *dma_chan)
+{
+   struct pt_dma_chan *chan = to_pt_chan(dma_chan);
+
+   vchan_free_chan_resources(&chan->vc);
+}
+
+static void pt_synchronize(struct dma_chan *dma_chan)
+{
+   struct pt_dma_chan *chan = to_pt_chan(dma_chan);
+
+   vchan_synchronize(&chan->vc);
+}
+
+static void pt_do_cleanup(struct virt_dma_desc *vd)
+{
+   struct pt_dma_desc *desc = to_pt_desc(vd);
+   struct pt_device *pt = desc->pt;
+
+   

[PATCH v2 1/5] dt-bindings: timer: Simplify conditional expressions

2021-03-21 Thread Samuel Holland
The sun4i timer IP block has a variable number of interrupts based on
the compatible. Use enums to combine the two sections for the existing
3-interrupt variants, and to simplify adding new compatible strings.

Acked-by: Maxime Ripard 
Signed-off-by: Samuel Holland 
---
 .../timer/allwinner,sun4i-a10-timer.yaml  | 25 ++-
 1 file changed, 7 insertions(+), 18 deletions(-)

diff --git 
a/Documentation/devicetree/bindings/timer/allwinner,sun4i-a10-timer.yaml 
b/Documentation/devicetree/bindings/timer/allwinner,sun4i-a10-timer.yaml
index 1c7cf32e7ac2..3462598e609d 100644
--- a/Documentation/devicetree/bindings/timer/allwinner,sun4i-a10-timer.yaml
+++ b/Documentation/devicetree/bindings/timer/allwinner,sun4i-a10-timer.yaml
@@ -34,8 +34,8 @@ allOf:
   - if:
   properties:
 compatible:
-  items:
-const: allwinner,sun4i-a10-timer
+  enum:
+- allwinner,sun4i-a10-timer
 
 then:
   properties:
@@ -46,8 +46,8 @@ allOf:
   - if:
   properties:
 compatible:
-  items:
-const: allwinner,sun8i-a23-timer
+  enum:
+- allwinner,sun8i-a23-timer
 
 then:
   properties:
@@ -58,20 +58,9 @@ allOf:
   - if:
   properties:
 compatible:
-  items:
-const: allwinner,sun8i-v3s-timer
-
-then:
-  properties:
-interrupts:
-  minItems: 3
-  maxItems: 3
-
-  - if:
-  properties:
-compatible:
-  items:
-const: allwinner,suniv-f1c100s-timer
+  enum:
+- allwinner,sun8i-v3s-timer
+- allwinner,suniv-f1c100s-timer
 
 then:
   properties:
-- 
2.26.2



[PATCH v2 4/5] arm64: dts: allwinner: Add sun4i MMIO timer nodes

2021-03-21 Thread Samuel Holland
For a CPU to enter an idle state, some timer must be available to
trigger an IRQ and wake it back up. The local ARM architectural timer is
not sufficient, because that timer stops when the CPU is powered down.
The ARM architectural timer from some other CPU can be used, but doing
so prevents that other CPU from entering an idle state. For all CPUs to
power down at the same time, Linux needs a timer which is not tied to
any CPU.

Hook up the "sun4i" timer so it can be used for this purpose. It runs at
24 MHz, which balances resolution and power consumption.
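
To put the 24 MHz figure in perspective, the arithmetic can be sketched as follows. This is only a back-of-the-envelope helper with hypothetical names, and it assumes a 32-bit down-counter, which is an assumption about the hardware rather than something stated in this patch:

```c
#include <assert.h>
#include <stdint.h>

#define TIMER_HZ 24000000ull	/* 24 MHz, from the commit message */

/* Tick period in picoseconds (integer division, rounded down). */
static uint64_t tick_period_ps(uint64_t hz)
{
	assert(hz > 0);
	return 1000000000000ull / hz;
}

/* Whole seconds until a counter of the given width wraps around. */
static uint64_t wrap_seconds(unsigned int bits, uint64_t hz)
{
	assert(bits > 0 && bits < 64 && hz > 0);
	return (1ull << bits) / hz;
}
```

At 24 MHz each tick is roughly 41.7 ns, and a 32-bit counter would cover a little under three minutes before wrapping, so the longest programmable sleep is bounded by that interval under this assumption.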

Signed-off-by: Samuel Holland 
---
 arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 9 +
 arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi  | 9 +
 2 files changed, 18 insertions(+)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi 
b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
index 9cac88576975..c89032dfb316 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
@@ -798,6 +798,15 @@ uart4_rts_cts_pins: uart4-rts-cts-pins {
};
};
 
+   timer@1c20c00 {
+   compatible = "allwinner,sun50i-a64-timer",
+"allwinner,sun8i-a23-timer";
+   reg = <0x01c20c00 0xa0>;
+   interrupts = ,
+;
+   clocks = <>;
+   };
+
wdt0: watchdog@1c20ca0 {
compatible = "allwinner,sun50i-a64-wdt",
 "allwinner,sun6i-a31-wdt";
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi 
b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
index 49e979794094..01884b32390d 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
+++ b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
@@ -271,6 +271,15 @@ cpu_speed_grade: cpu-speed-grade@1c {
};
};
 
+   timer@3009000 {
+   compatible = "allwinner,sun50i-h6-timer",
+"allwinner,sun8i-a23-timer";
+   reg = <0x03009000 0xa0>;
+   interrupts = ,
+;
+   clocks = <>;
+   };
+
watchdog: watchdog@30090a0 {
compatible = "allwinner,sun50i-h6-wdt",
 "allwinner,sun6i-a31-wdt";
-- 
2.26.2



[PATCH v2 5/5] arm64: sunxi: Build the sun4i timer driver

2021-03-21 Thread Samuel Holland
While the ARM architectural timer is generally the best timer to use,
a non-c3stop timer is needed for cpuidle.

Build the "sun4i" timer driver so it can be used for this purpose.
It is present on all 64-bit sunxi SoCs.

Signed-off-by: Samuel Holland 
---
 arch/arm64/Kconfig.platforms | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
index cdfd5fed457f..7f6a66431fa7 100644
--- a/arch/arm64/Kconfig.platforms
+++ b/arch/arm64/Kconfig.platforms
@@ -26,6 +26,7 @@ config ARCH_SUNXI
select IRQ_FASTEOI_HIERARCHY_HANDLERS
select PINCTRL
select RESET_CONTROLLER
+   select SUN4I_TIMER
help
  This enables support for Allwinner sunxi based SoCs like the A64.
 
-- 
2.26.2



[PATCH v2 2/5] dt-bindings: timer: Add compatibles for sun50i timers

2021-03-21 Thread Samuel Holland
The sun50i SoCs contain timer blocks which are useful as broadcast
clockevent sources. They each have 2 interrupts, matching the A23
variant, so add the new compatible strings with the A23 compatible
as a fallback.

Acked-by: Maxime Ripard 
Signed-off-by: Samuel Holland 
---
 .../timer/allwinner,sun4i-a10-timer.yaml| 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git 
a/Documentation/devicetree/bindings/timer/allwinner,sun4i-a10-timer.yaml 
b/Documentation/devicetree/bindings/timer/allwinner,sun4i-a10-timer.yaml
index 3462598e609d..53fd24bdc34e 100644
--- a/Documentation/devicetree/bindings/timer/allwinner,sun4i-a10-timer.yaml
+++ b/Documentation/devicetree/bindings/timer/allwinner,sun4i-a10-timer.yaml
@@ -12,11 +12,18 @@ maintainers:
 
 properties:
   compatible:
-enum:
-  - allwinner,sun4i-a10-timer
-  - allwinner,sun8i-a23-timer
-  - allwinner,sun8i-v3s-timer
-  - allwinner,suniv-f1c100s-timer
+oneOf:
+  - enum:
+  - allwinner,sun4i-a10-timer
+  - allwinner,sun8i-a23-timer
+  - allwinner,sun8i-v3s-timer
+  - allwinner,suniv-f1c100s-timer
+  - items:
+  - enum:
+  - allwinner,sun50i-a64-timer
+  - allwinner,sun50i-h6-timer
+  - allwinner,sun50i-h616-timer
+  - const: allwinner,sun8i-a23-timer
 
   reg:
 maxItems: 1
-- 
2.26.2



[PATCH v2 0/5] arm64: sunxi: Enable the sun4i timer

2021-03-21 Thread Samuel Holland
In preparation for adding CPU idle states, hook up the sun4i timer.
Having a non-c3stop clockevent source available is necessary for all
CPUs to simultaneously enter a local-timer-stop idle state.

Changes from v1:
  - Removed H616 changes (depends on an unmerged patch set)
  - Reworded the patch 4-5 commit messages for clarity
  - Added Acked-by tags

Samuel Holland (5):
  dt-bindings: timer: Simplify conditional expressions
  dt-bindings: timer: Add compatibles for sun50i timers
  arm64: dts: allwinner: a64: Sort watchdog node
  arm64: dts: allwinner: Add sun4i MMIO timer nodes
  arm64: sunxi: Build the sun4i timer driver

 .../timer/allwinner,sun4i-a10-timer.yaml  | 42 +--
 arch/arm64/Kconfig.platforms  |  1 +
 arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 25 +++
 arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi  |  9 
 4 files changed, 46 insertions(+), 31 deletions(-)

-- 
2.26.2



[PATCH v2 3/5] arm64: dts: allwinner: a64: Sort watchdog node

2021-03-21 Thread Samuel Holland
Nodes should be sorted by unit address. Move the watchdog node to the
correct place, so it will be next to the timer node when that is added.

Signed-off-by: Samuel Holland 
---
 arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi 
b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
index 57786fc120c3..9cac88576975 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
@@ -798,6 +798,14 @@ uart4_rts_cts_pins: uart4-rts-cts-pins {
};
};
 
+   wdt0: watchdog@1c20ca0 {
+   compatible = "allwinner,sun50i-a64-wdt",
+"allwinner,sun6i-a31-wdt";
+   reg = <0x01c20ca0 0x20>;
+   interrupts = ;
+   clocks = <>;
+   };
+
spdif: spdif@1c21000 {
#sound-dai-cells = <0>;
compatible = "allwinner,sun50i-a64-spdif",
@@ -1321,13 +1329,5 @@ r_rsb: rsb@1f03400 {
#address-cells = <1>;
#size-cells = <0>;
};
-
-   wdt0: watchdog@1c20ca0 {
-   compatible = "allwinner,sun50i-a64-wdt",
-"allwinner,sun6i-a31-wdt";
-   reg = <0x01c20ca0 0x20>;
-   interrupts = ;
-   clocks = <>;
-   };
};
 };
-- 
2.26.2



Re: [PATCH v2] MIPS/bpf: Enable bpf_probe_read{, str}() on MIPS again

2021-03-21 Thread Maciej W. Rozycki
On Thu, 18 Mar 2021, Tiezhu Yang wrote:

> diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
> index 160b3a8..4b94ec7 100644
> --- a/arch/mips/Kconfig
> +++ b/arch/mips/Kconfig
> @@ -6,6 +6,7 @@ config MIPS
>   select ARCH_BINFMT_ELF_STATE if MIPS_FP_SUPPORT
>   select ARCH_HAS_FORTIFY_SOURCE
>   select ARCH_HAS_KCOV
> + select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE

 Hmm, documentation on ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE seems rather 
scarce, but based on my guess shouldn't this be "if !EVA"?

  Maciej


Re: [PATCH] sched/fair: remove redundant test_idle_cores for non-smt

2021-03-21 Thread Li, Aubrey
Hi Barry,

On 2021/3/21 6:14, Barry Song wrote:
> update_idle_core() is only done for the case of sched_smt_present.
> but test_idle_cores() is done for all machines even those without
> smt.

The patch looks good to me.
May I know for what case we need to keep CONFIG_SCHED_SMT for non-smt
machines?

Thanks,
-Aubrey


> this could contribute to up 8%+ hackbench performance loss on a
> machine like kunpeng 920 which has no smt. this patch removes the
> redundant test_idle_cores() for non-smt machines.
> 
> we run the below hackbench with different -g parameter from 2 to
> 14, for each different g, we run the command 10 times and get the
> average time:
> $ numactl -N 0 hackbench -p -T -l 2 -g $1
> 
> hackbench will report the time which is needed to complete a certain
> number of messages transmissions between a certain number of tasks,
> for example:
> $ numactl -N 0 hackbench -p -T -l 2 -g 10
> Running in threaded mode with 10 groups using 40 file descriptors each
> (== 400 tasks)
> Each sender will pass 2 messages of 100 bytes
> 
> The below is the result of hackbench w/ and w/o this patch:
> g=        2      4      6      8     10      12      14
> w/o: 1.8151 3.8499 5.5142 7.2491 9.0340 10.7345 12.0929
> w/ : 1.8428 3.7436 5.4501 6.9522 8.2882  9.9535 11.3367
>                            +4.1%  +8.3%   +7.3%   +6.3%
> 
> Signed-off-by: Barry Song 
> ---
>  kernel/sched/fair.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2e2ab1e..de42a32 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6038,9 +6038,11 @@ static inline bool test_idle_cores(int cpu, bool def)
>  {
>   struct sched_domain_shared *sds;
>  
> - sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
> - if (sds)
> - return READ_ONCE(sds->has_idle_cores);
> + if (static_branch_likely(&sched_smt_present)) {
> + sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
> + if (sds)
> + return READ_ONCE(sds->has_idle_cores);
> + }
>  
>   return def;
>  }
> 
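
For reference, the percentages in the hackbench table above can be reproduced from the two rows of timings. A minimal sketch, with improvement_pct() as a hypothetical helper name; lower hackbench time is better, so a positive result means the patched kernel was faster:

```c
#include <assert.h>

/*
 * Relative reduction in run time, in percent, going from the
 * unpatched time to the patched time.
 */
static double improvement_pct(double without_patch, double with_patch)
{
	assert(without_patch > 0.0);
	return (without_patch - with_patch) / without_patch * 100.0;
}
```

Applied to the g=8 column (7.2491 s without, 6.9522 s with) this gives about 4.1%, matching the table. Applied to g=2 it gives a slightly negative number, which is why no gain is listed for the smallest group counts.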



Re: [RFC PATCH 6/7] iommu/amd: Introduce amd_iommu_pgtable command-line option

2021-03-21 Thread Suravee Suthikulpanit

Joerg,

On 3/18/21 10:33 PM, Joerg Roedel wrote:

On Fri, Mar 12, 2021 at 03:04:10AM -0600, Suravee Suthikulpanit wrote:

To allow specification whether to use v1 or v2 IOMMU pagetable for
DMA remapping when calling kernel DMA-API.

Signed-off-by: Suravee Suthikulpanit 
---
  Documentation/admin-guide/kernel-parameters.txt |  6 ++
  drivers/iommu/amd/init.c| 15 +++
  2 files changed, 21 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 04545725f187..466e807369ea 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -319,6 +319,12 @@
 This mode requires kvm-amd.avic=1.
 (Default when IOMMU HW support is present.)
  
+	amd_iommu_pgtable= [HW,X86-64]

+   Specifies one of the following AMD IOMMU page table to
+   be used for DMA remapping for DMA-API:
+   v1 - Use v1 page table (Default)
+   v2 - Use v2 page table


Any reason v2 can not be the default when it is supported by the IOMMU?



Eventually, we should be able to default to v2. However, we will need to
make sure that the v2 implementation has performance comparable to the
currently used v1.

FYI: I'm also looking into adding support for SVA as well.

Thanks,
Suravee


Re: [PATCH] docs: submitting-patches Fix a typo

2021-03-21 Thread Bhaskar Chowdhury

On 04:02 Mon 22 Mar 2021, Matthew Wilcox wrote:
> On Mon, Mar 22, 2021 at 09:18:34AM +0530, Bhaskar Chowdhury wrote:
> > On 03:44 Mon 22 Mar 2021, Matthew Wilcox wrote:
> > > On Mon, Mar 22, 2021 at 09:00:00AM +0530, Bhaskar Chowdhury wrote:
> > > >
> > > > s/mesages/messages/
> > >
> > > did you test the build afterwards?  you forgot to do something.
> > >
> > What are you talking about??? It is going over my head...why the build
> > required?? A spello needs a rebuild Wondering
>
> don't argue with me.  just type 'make htmldocs' and find out.

Well, some other time ..I have the habit of arguing with people ..can't help ...






Re: [syzbot] KASAN: use-after-free Read in disk_part_iter_next (2)

2021-03-21 Thread Bart Van Assche
On 3/21/21 7:35 PM, Ming Lei wrote:
> On Mon, Mar 22, 2021 at 7:03 AM Bart Van Assche  wrote:
>>
>> On 3/14/21 4:08 AM, syzbot wrote:
>>> syzbot found the following issue on:
>>>
>>> HEAD commit:280d542f Merge tag 'drm-fixes-2021-03-05' of git://anongit..
>>> git tree:   upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=15ade5aed0
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=952047a9dbff6a6a
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=8fede7e30c7cee0de139
>>
>> #syz test: https://github.com/bvanassche/linux a5f35387ebdc
> 
> It should be the same issue which was addressed by
> 
>aebf5db91705 block: fix use-after-free in disk_part_iter_next
> 
> but converting to xarray introduced the issue again.

Hi Ming,

Since that patch does not re-apply cleanly, do you want to convert that
patch to the latest kernel version or do you perhaps expect me to do that?

Thanks,

Bart.



[ANNOUNCE] 4.9.260-rt174

2021-03-21 Thread Luis Claudio R. Goncalves
Hello RT-list!

I'm pleased to announce the 4.9.260-rt174 stable release.

You can get this release via the git tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git

  branch: v4.9-rt
  Head SHA1: a1ce8735f60285bcf3df3ab01e1ea2588e90c540

Or to build 4.9.260-rt174 directly, the following patches should be applied:

  https://www.kernel.org/pub/linux/kernel/v4.x/linux-4.9.tar.xz

  https://www.kernel.org/pub/linux/kernel/v4.x/patch-4.9.260.xz

  
https://www.kernel.org/pub/linux/kernel/projects/rt/4.9/patch-4.9.260-rt174.patch.xz

Enjoy!
Luis



Re: [PATCH] scsi: mpt3sas: Fix a typo

2021-03-21 Thread Randy Dunlap
On 3/21/21 8:21 PM, Bhaskar Chowdhury wrote:
> 
> s/encloure/enclosure/
> 
> Signed-off-by: Bhaskar Chowdhury 

Acked-by: Randy Dunlap 

> ---
>  drivers/scsi/mpt3sas/mpt3sas_base.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
> index ac066f86bb14..398fd07ee9f5 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
> @@ -5232,7 +5232,7 @@ _base_static_config_pages(struct MPT3SAS_ADAPTER *ioc)
>   * mpt3sas_free_enclosure_list - release memory
>   * @ioc: per adapter object
>   *
> - * Free memory allocated during encloure add.
> + * Free memory allocated during enclosure add.
>   */
>  void
>  mpt3sas_free_enclosure_list(struct MPT3SAS_ADAPTER *ioc)
> --


-- 
~Randy



Re: [PATCH V2] xfs: Rudimentary spelling fix

2021-03-21 Thread Randy Dunlap
On 3/21/21 8:45 PM, Bhaskar Chowdhury wrote:
> s/sytemcall/syscall/
> 
> Signed-off-by: Bhaskar Chowdhury 

Acked-by: Randy Dunlap 

> ---
>   Changes from V1:
>Randy's suggestion incorporated.
> 
>  fs/xfs/xfs_inode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index f93370bd7b1e..3087d03a6863 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -2870,7 +2870,7 @@ xfs_finish_rename(
>  /*
>   * xfs_cross_rename()
>   *
> - * responsible for handling RENAME_EXCHANGE flag in renameat2() sytemcall
> + * responsible for handling RENAME_EXCHANGE flag in renameat2() syscall
>   */
>  STATIC int
>  xfs_cross_rename(
> --


-- 
~Randy



Re: [PATCH] docs: submitting-patches Fix a typo

2021-03-21 Thread Matthew Wilcox
On Mon, Mar 22, 2021 at 09:18:34AM +0530, Bhaskar Chowdhury wrote:
> On 03:44 Mon 22 Mar 2021, Matthew Wilcox wrote:
> > On Mon, Mar 22, 2021 at 09:00:00AM +0530, Bhaskar Chowdhury wrote:
> > > 
> > > s/mesages/messages/
> > 
> > did you test the build afterwards?  you forgot to do something.
> > 
> What are you talking about??? It is going over my head...why the build
> required?? A spello needs a rebuild Wondering

don't argue with me.  just type 'make htmldocs' and find out.



Re: [PATCH] mm: Fix typos in comments

2021-03-21 Thread Randy Dunlap
On 3/21/21 7:51 PM, Ingo Molnar wrote:
> 
> Fix ~93 single-word typos in locking code comments, plus a few very 
> obvious grammar mistakes.
> 
> Signed-off-by: Ingo Molnar 
> Cc: Andrew Morton 
> Cc: Rik van Riel 
> Cc: linux...@kvack.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  include/linux/mm.h  |  2 +-
>  include/linux/vmalloc.h |  4 ++--
>  mm/balloon_compaction.c |  4 ++--
>  mm/compaction.c |  2 +-
>  mm/filemap.c|  2 +-
>  mm/gup.c|  2 +-
>  mm/highmem.c|  2 +-
>  mm/huge_memory.c|  4 ++--
>  mm/hugetlb.c|  4 ++--
>  mm/internal.h   |  2 +-
>  mm/kasan/kasan.h|  8 
>  mm/kasan/quarantine.c   |  4 ++--
>  mm/kasan/shadow.c   |  4 ++--
>  mm/kfence/report.c  |  2 +-
>  mm/khugepaged.c |  2 +-
>  mm/kmemleak.c   |  2 +-
>  mm/ksm.c|  4 ++--
>  mm/madvise.c|  4 ++--
>  mm/memcontrol.c | 18 +-
>  mm/memory-failure.c |  2 +-
>  mm/memory.c | 12 ++--
>  mm/mempolicy.c  |  4 ++--
>  mm/migrate.c|  8 
>  mm/mmap.c   |  4 ++--
>  mm/mprotect.c   |  2 +-
>  mm/mremap.c |  2 +-
>  mm/oom_kill.c   |  2 +-
>  mm/page-writeback.c |  4 ++--
>  mm/page_alloc.c | 14 +++---
>  mm/page_owner.c |  2 +-
>  mm/page_reporting.c |  2 +-
>  mm/percpu-internal.h|  2 +-
>  mm/percpu.c |  2 +-
>  mm/pgalloc-track.h  |  6 +++---
>  mm/slab.c   |  8 
>  mm/slub.c   | 10 +-
>  mm/swap_slots.c |  2 +-
>  mm/swap_state.c |  2 +-
>  mm/swapfile.c   |  4 ++--
>  mm/util.c   |  2 +-
>  mm/vmalloc.c|  8 
>  mm/vmstat.c |  2 +-
>  mm/zpool.c  |  2 +-
>  mm/zsmalloc.c   |  2 +-
>  44 files changed, 93 insertions(+), 93 deletions(-)

> diff --git a/mm/compaction.c b/mm/compaction.c
> index e04f4476e68e..048686fba230 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1977,7 +1977,7 @@ static unsigned int fragmentation_score_wmark(pg_data_t *pgdat, bool low)
>   unsigned int wmark_low;
>  
>   /*
> -  * Cap the low watermak to avoid excessive compaction
> +  * Cap the low watermark to avoid excessive compaction
>* activity in case a user sets the proactivess tunable

proactiveness
?

>* close to 100 (maximum).
>*/
> diff --git a/mm/memory.c b/mm/memory.c
> index 5efa07fb6cdc..a0d4fedd5e9b 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4463,7 +4463,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
>   * @flags: the fault flags.
>   * @ret: the fault retcode.
>   *
> - * This will take care of most of the page fault accountings.  Meanwhile, it
> + * This will take care of most of the page fault accounting.  Meanwhile, it
>   * will also include the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf counter
>   * updates.  However note that the handling of PERF_COUNT_SW_PAGE_FAULTS should

However,

>   * still be in per-arch page fault handlers at the entry of page fault.
> diff --git a/mm/page_reporting.c b/mm/page_reporting.c
> index c50d93ffa252..8b9197074632 100644
> --- a/mm/page_reporting.c
> +++ b/mm/page_reporting.c
> @@ -86,7 +86,7 @@ page_reporting_drain(struct page_reporting_dev_info *prdev,
>   continue;
>  
>   /*
> -  * If page was not comingled with another page we can
> +  * If page was not commingled with another page we can

Either spelling seems to be acceptable.

>* consider the result to be "reported" since the page
>* hasn't been modified, otherwise we will need to
>* report on the new larger page when we make our way
> diff --git a/mm/slub.c b/mm/slub.c
> index 3021ce9bf1b3..a48892cc8359 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> diff --git a/mm/swap_slots.c b/mm/swap_slots.c
> index be9de6d5b516..0158aa9c3e55 100644
> --- a/mm/swap_slots.c
> +++ b/mm/swap_slots.c
> @@ -16,7 +16,7 @@
>   * to local caches without needing to acquire swap_info
>   * lock.  We do not reuse the returned slots directly but
>   * move them back to the global pool in a batch.  This
> - * allows the slots to coaellesce and reduce fragmentation.
> + * allows the slots to coalescence and reduce fragmentation.

   to coalesce

>   *
>   * The swap entry allocated is marked with SWAP_HAS_CACHE
>   * flag in map_count that prevents it from being allocated


Mostly looks like a good, big cleanup. Thanks.

-- 
~Randy



Re: [PATCH] mm: Fix typos in comments

2021-03-21 Thread Bhaskar Chowdhury

On 20:52 Sun 21 Mar 2021, Randy Dunlap wrote:

On 3/21/21 8:44 PM, Matthew Wilcox wrote:

On Mon, Mar 22, 2021 at 03:51:52AM +0100, Ingo Molnar wrote:

+++ b/mm/huge_memory.c
@@ -1794,7 +1794,7 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 /*
  * Returns
  *  - 0 if PMD could not be locked
- *  - 1 if PMD was locked but protections unchange and TLB flush unnecessary
+ *  - 1 if PMD was locked but protections unchanged and TLB flush unnecessary
  *  - HPAGE_PMD_NR is protections changed and TLB flush necessary


s/is/if/


@@ -5306,7 +5306,7 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,

/*
 * vma need span at least one aligned PUD size and the start,end range
-* must at least partialy within it.
+* must at least partially within it.


 * vma needs to span at least one aligned PUD size, and the range
 * must be at least partially within in.


 /*
  * swapon tell device that all the old swap contents can be discarded,
- * to allow the swap device to optimize its wear-levelling.
+ * to allow the swap device to optimize its wear-leveling.
  */


Levelling is british english, leveling is american english.  we don't
usually "correct" one into the other.


How about "labelled" (from mm/kasan/shadow.c):



Not sure "levelling" and "labelling" are the same thing...I think all of us
missed the context ...isn't that dictated by the context (source code,
effects) ...


@@ -384,7 +384,7 @@ static int kasan_depopulate_vmalloc_pte(pte_t *ptep, unsigned long addr,
 * How does this work?
 * ---
 *
- * We have a region that is page aligned, labelled as A.
+ * We have a region that is page aligned, labeled as A.
 * That might not map onto the shadow in a way that is page-aligned:


--
~Randy





[no subject]

2021-03-21 Thread Xu Yihang


Git message updated.



[PATCH -next] x86: Fix unused variable 'hi'

2021-03-21 Thread Xu Yihang
Fixes the following W=1 kernel build warning(s):
arch/x86/hyperv/hv_apic.c:58:15: warning: variable ‘hi’ set but not used [-Wunused-but-set-variable]

Compiled with CONFIG_HYPERV enabled:
make allmodconfig ARCH=x86_64 CROSS_COMPILE=x86_64-linux-gnu-
make W=1 arch/x86/hyperv/hv_apic.o ARCH=x86_64 CROSS_COMPILE=x86_64-linux-gnu-

HV_X64_MSR_EOI stores its value in bits 31:0 and HV_X64_MSR_TPR in bits 7:0,
which means the upper 32 bits are not really used, so mark 'hi' as
__maybe_unused.

Reported-by: Hulk Robot 
Signed-off-by: Xu Yihang 
---
 arch/x86/hyperv/hv_apic.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index 284e73661a18..c0b0a5774f31 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -55,7 +55,8 @@ static void hv_apic_icr_write(u32 low, u32 id)
 
 static u32 hv_apic_read(u32 reg)
 {
-   u32 reg_val, hi;
+   u32 hi __maybe_unused;
+   u32 reg_val;
 
switch (reg) {
case APIC_EOI:
-- 
2.17.1



Re: [PATCH v4 01/25] mm: Introduce struct folio

2021-03-21 Thread Matthew Wilcox
On Mon, Mar 22, 2021 at 12:52:40PM +1000, Nicholas Piggin wrote:
> Excerpts from Matthew Wilcox's message of March 19, 2021 11:25 am:
> > On Fri, Mar 19, 2021 at 10:56:45AM +1100, Balbir Singh wrote:
> >> On Fri, Mar 05, 2021 at 04:18:37AM +, Matthew Wilcox (Oracle) wrote:
> >> > A struct folio refers to an entire (possibly compound) page.  A function
> >> > which takes a struct folio argument declares that it will operate on the
> >> > entire compound page, not just PAGE_SIZE bytes.  In return, the caller
> >> > guarantees that the pointer it is passing does not point to a tail page.
> >> >
> >> 
> >> Is this a part of a larger use case or general cleanup/refactor where
> >> the split between page and folio simplify programming?
> > 
> > The goal here is to manage memory in larger chunks.  Pages are now too
> > small for just about every workload.  Even compiling the kernel sees a 7%
> > performance improvement just by doing readahead using relatively small
> > THPs (16k-256k).  You can see that work here:
> > https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/master
> 
> The 7% improvement comes from cache cold kbuild by improving IO
> patterns?
> 
> Just wondering what kind of readahead is enabled by this that can't
> be done with base page size.

I see my explanation earlier was confusing.  What I meant to say
was that the only way in that patch set to create larger pages was
at readahead time.  Writes were incapable of creating larger pages.
Once pages were in the page cache, they got managed at that granularity
unless they got split by a truncate/holepunch/io-error/...

I don't have good perf runs of kernbench to say exactly where we got the
benefit.  My assumption is that because we're managing an entire, say,
256kB page as a single unit on the LRU list, we benefit from lower LRU
lock contention.  There's also the benefit of batching, eg, allocating
a single 256kB page from the page allocator may well be more effective
than allocating 64 4kB pages.


[PATCH v3] exfat: speed up iterate/lookup by fixing start point of traversing cluster chain

2021-03-21 Thread Hyeongseok Kim
When directory iteration or lookup is called, the start point for
traversing the cluster chain is wrongly rewound to the first cluster of
the parent directory entry. This causes the cluster chain to be traversed
repeatedly from the first entry of the parent directory, which performs
poorly when the parent directory holds a huge number of files.
Fix this by not rewinding and continuing from the currently referenced
cluster and dir entry.

Tested with 50,000 files under single directory / 256GB sdcard,
with command "time ls -l > /dev/null",
Before : 0m08.69s real 0m00.27s user 0m05.91s system
After  : 0m07.01s real 0m00.25s user 0m04.34s system

Signed-off-by: Hyeongseok Kim 
---
 fs/exfat/dir.c  | 19 +--
 fs/exfat/exfat_fs.h |  2 +-
 fs/exfat/namei.c|  9 -
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/fs/exfat/dir.c b/fs/exfat/dir.c
index 7efb1c6d4808..c4523648472a 100644
--- a/fs/exfat/dir.c
+++ b/fs/exfat/dir.c
@@ -147,7 +147,7 @@ static int exfat_readdir(struct inode *inode, loff_t *cpos, struct exfat_dir_ent
0);
 
*uni_name.name = 0x0;
-   exfat_get_uniname_from_ext_entry(sb, , dentry,
+   exfat_get_uniname_from_ext_entry(sb, , i,
uni_name.name);
exfat_utf16_to_nls(sb, &uni_name,
dir_entry->namebuf.lfn,
@@ -911,14 +911,19 @@ enum {
 };
 
 /*
- * return values:
- *   >= 0  : return dir entiry position with the name in dir
- *   -ENOENT   : entry with the name does not exist
- *   -EIO  : I/O error
+ * @ei: inode info of parent directory
+ * @p_dir:  directory structure of parent directory
+ * @num_entries:entry size of p_uniname
+ * @hint_opt:   If p_uniname is found, filled with optimized dir/entry
+ *  for traversing cluster chain.
+ * @return:
+ *   >= 0:  file directory entry position where the name exists
+ *   -ENOENT:   entry with the name does not exist
+ *   -EIO:  I/O error
  */
 int exfat_find_dir_entry(struct super_block *sb, struct exfat_inode_info *ei,
struct exfat_chain *p_dir, struct exfat_uni_name *p_uniname,
-   int num_entries, unsigned int type)
+   int num_entries, unsigned int type, struct exfat_hint *hint_opt)
 {
int i, rewind = 0, dentry = 0, end_eidx = 0, num_ext = 0, len;
int order, step, name_len = 0;
@@ -995,6 +1000,8 @@ int exfat_find_dir_entry(struct super_block *sb, struct exfat_inode_info *ei,
 
if (entry_type == TYPE_FILE || entry_type == TYPE_DIR) {
step = DIRENT_STEP_FILE;
+   hint_opt->clu = clu.dir;
+   hint_opt->eidx = i;
if (type == TYPE_ALL || type == entry_type) {
num_ext = ep->dentry.file.num_ext;
step = DIRENT_STEP_STRM;
diff --git a/fs/exfat/exfat_fs.h b/fs/exfat/exfat_fs.h
index e77fe2f45cf2..1d6da61157c9 100644
--- a/fs/exfat/exfat_fs.h
+++ b/fs/exfat/exfat_fs.h
@@ -457,7 +457,7 @@ void exfat_update_dir_chksum_with_entry_set(struct exfat_entry_set_cache *es);
 int exfat_calc_num_entries(struct exfat_uni_name *p_uniname);
 int exfat_find_dir_entry(struct super_block *sb, struct exfat_inode_info *ei,
struct exfat_chain *p_dir, struct exfat_uni_name *p_uniname,
-   int num_entries, unsigned int type);
+   int num_entries, unsigned int type, struct exfat_hint *hint_opt);
 int exfat_alloc_new_dir(struct inode *inode, struct exfat_chain *clu);
 int exfat_find_location(struct super_block *sb, struct exfat_chain *p_dir,
int entry, sector_t *sector, int *offset);
diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c
index 1f7b3dc66fcd..24b41103d1cc 100644
--- a/fs/exfat/namei.c
+++ b/fs/exfat/namei.c
@@ -596,6 +596,8 @@ static int exfat_find(struct inode *dir, struct qstr *qname,
struct exfat_inode_info *ei = EXFAT_I(dir);
struct exfat_dentry *ep, *ep2;
struct exfat_entry_set_cache *es;
+   /* for optimized dir & entry to prevent long traverse of cluster chain */
+   struct exfat_hint hint_opt;
 
if (qname->len == 0)
return -ENOENT;
@@ -619,7 +621,7 @@ static int exfat_find(struct inode *dir, struct qstr *qname,
 
/* search the file name for directories */
dentry = exfat_find_dir_entry(sb, ei, &cdir, &uni_name,
-   num_entries, TYPE_ALL);
+   num_entries, TYPE_ALL, &hint_opt);
 
if (dentry < 0)
return dentry; /* -error value */
@@ -628,6 +630,11 @@ static int exfat_find(struct inode *dir, struct qstr *qname,
info->entry = dentry;
info->num_subdirs = 0;
 
+   /* adjust cdir to the optimized value */
+   

[PATCH] Input: serio - make write method mandatory

2021-03-21 Thread Dmitry Torokhov
Given that all serio drivers except one implement write() method
let's make it mandatory to avoid testing for its presence whenever
we attempt to use it.

Signed-off-by: Dmitry Torokhov 
---
 drivers/input/serio/ams_delta_serio.c | 6 ++
 drivers/input/serio/serio.c   | 5 +
 include/linux/serio.h | 5 +
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/input/serio/ams_delta_serio.c b/drivers/input/serio/ams_delta_serio.c
index 1c0be299f179..a1c314897951 100644
--- a/drivers/input/serio/ams_delta_serio.c
+++ b/drivers/input/serio/ams_delta_serio.c
@@ -89,6 +89,11 @@ static irqreturn_t ams_delta_serio_interrupt(int irq, void *dev_id)
return IRQ_HANDLED;
 }
 
+static int ams_delta_serio_write(struct serio *serio, u8 data)
+{
+   return -EINVAL;
+}
+
 static int ams_delta_serio_open(struct serio *serio)
 {
struct ams_delta_serio *priv = serio->port_data;
@@ -157,6 +162,7 @@ static int ams_delta_serio_init(struct platform_device *pdev)
priv->serio = serio;
 
serio->id.type = SERIO_8042;
+   serio->write = ams_delta_serio_write;
serio->open = ams_delta_serio_open;
serio->close = ams_delta_serio_close;
strlcpy(serio->name, "AMS DELTA keyboard adapter", sizeof(serio->name));
diff --git a/drivers/input/serio/serio.c b/drivers/input/serio/serio.c
index 29f491082926..8d229a11bb6b 100644
--- a/drivers/input/serio/serio.c
+++ b/drivers/input/serio/serio.c
@@ -694,6 +694,11 @@ EXPORT_SYMBOL(serio_reconnect);
  */
 void __serio_register_port(struct serio *serio, struct module *owner)
 {
+   if (!serio->write) {
+   pr_err("%s: refusing to register %s without write method\n",
+  __func__, serio->name);
+   return;
+   }
serio_init_port(serio);
serio_queue_event(serio, owner, SERIO_REGISTER_PORT);
 }
diff --git a/include/linux/serio.h b/include/linux/serio.h
index 6c27d413da92..075f1b8d76fa 100644
--- a/include/linux/serio.h
+++ b/include/linux/serio.h
@@ -121,10 +121,7 @@ void serio_unregister_driver(struct serio_driver *drv);
 
 static inline int serio_write(struct serio *serio, unsigned char data)
 {
-   if (serio->write)
-   return serio->write(serio, data);
-   else
-   return -1;
+   return serio->write(serio, data);
 }
 
 static inline void serio_drv_write_wakeup(struct serio *serio)
-- 
2.31.0.rc2.261.g7f71774620-goog


-- 
Dmitry


Re: [PATCH] mm: Fix typos in comments

2021-03-21 Thread Randy Dunlap
On 3/21/21 8:44 PM, Matthew Wilcox wrote:
> On Mon, Mar 22, 2021 at 03:51:52AM +0100, Ingo Molnar wrote:
>> +++ b/mm/huge_memory.c
>> @@ -1794,7 +1794,7 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
>>  /*
>>   * Returns
>>   *  - 0 if PMD could not be locked
>> - *  - 1 if PMD was locked but protections unchange and TLB flush unnecessary
>> + *  - 1 if PMD was locked but protections unchanged and TLB flush unnecessary
>>   *  - HPAGE_PMD_NR is protections changed and TLB flush necessary
> 
> s/is/if/
> 
>> @@ -5306,7 +5306,7 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
>>  
>>  /*
>>   * vma need span at least one aligned PUD size and the start,end range
>> - * must at least partialy within it.
>> + * must at least partially within it.
> 
>* vma needs to span at least one aligned PUD size, and the range
>* must be at least partially within in.
> 
>>  /*
>>   * swapon tell device that all the old swap contents can be discarded,
>> - * to allow the swap device to optimize its wear-levelling.
>> + * to allow the swap device to optimize its wear-leveling.
>>   */
> 
> Levelling is british english, leveling is american english.  we don't
> usually "correct" one into the other.

How about "labelled" (from mm/kasan/shadow.c):

@@ -384,7 +384,7 @@ static int kasan_depopulate_vmalloc_pte(pte_t *ptep, unsigned long addr,
  * How does this work?
  * ---
  *
- * We have a region that is page aligned, labelled as A.
+ * We have a region that is page aligned, labeled as A.
  * That might not map onto the shadow in a way that is page-aligned:


-- 
~Randy



Re: [PATCH] docs: submitting-patches Fix a typo

2021-03-21 Thread Bhaskar Chowdhury

On 03:44 Mon 22 Mar 2021, Matthew Wilcox wrote:

On Mon, Mar 22, 2021 at 09:00:00AM +0530, Bhaskar Chowdhury wrote:


s/mesages/messages/


did you test the build afterwards?  you forgot to do something.


What are you talking about??? It is going over my head...why the build
required?? A spello needs a rebuild Wondering


Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/process/submitting-patches.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/process/submitting-patches.rst b/Documentation/process/submitting-patches.rst
index 91de63b201c1..8b2676527b7e 100644
--- a/Documentation/process/submitting-patches.rst
+++ b/Documentation/process/submitting-patches.rst
@@ -679,7 +679,7 @@ generates appropriate diffstats by default.)
 See more details on the proper patch format in the following
 references.

-Backtraces in commit mesages
+Backtraces in commit messages
 

 Backtraces help document the call chain leading to a problem. However,
--
2.31.0





Re: [GIT PULL] ext4 fixes for v5.12

2021-03-21 Thread Gao Xiang
On Sun, Mar 21, 2021 at 11:36:10PM -0400, Theodore Ts'o wrote:
> On Mon, Mar 22, 2021 at 11:05:13AM +0800, Gao Xiang wrote:
> > I think the legal name would be "Zhang Yi" (family name goes first [1])
> > according to
> > The Chinese phonetic alphabet spelling rules for Chinese names [2].
> > 
> > Indeed, that is also what the legal name is written in alphabet on our
> > passport or credit/debit cards.
> > 
> > Also, many official English-written materials use it in that way, for
> > example, a somewhat famous basketball player Yao Ming [3][4][5].
> > 
> > That is what I wrote my own name as this but I also noticed the western
> > ordering of names is quite common for Chinese people in Linux kernel.
> > Anyway, it's just my preliminary personal thought (might be just my
> > own preference) according to (I think, maybe) formal occasions.
> 

Hi Ted,

> Yeah, there doesn't seem to be a lot of consistency with the ordering
> of Chinese names when they are written in Roman characters.  Some
> people put the family name first, and other people will put the
> personal (first) name first.  In some cases it may be because the
> developer in question is living in America, and so they've decided to
> use the American naming convention.  (Two example of this are former
> ext4 developers Mingming Cao and Jiaying Zhang, who live in Portland
> and Los Angeles, and their family names are Cao and Zhang,
> respectively.)

Yes, totally agree. Honestly, I think that's all down to our own preference
(I just showed some local official materials, though.)

> 
> My personal opinion is people should use whatever name they are
> comfortable with, using whatever characters they prefer.  The one

Totally agree.

> thing that would be helpful for me is for people to give a hint about
> how they would prefer to be called --- for example, would you prefer
> that start an e-mail with the salutation, "Hi Gao", "Hi Xiang", or "Hi
> Gao Xiang"?

Honestly, I think either way would be fine, even in a Chinese-speaking
environment...

> 
> And if I don't know, and I guess wrong, please feel free to correct
> me, either privately, or publicly on the e-mail list, if you think
> it would be helpful for more people to understand how you'd prefer to
> be called.

Nope, it's just a minor thing. I didn't intend to give any direct or
indirect opinion or hint on this. Sorry if that was misleading :)

Thanks,
Gao Xiang

> 
> Cheers,
> 
>   - Ted
> 



Re: [PATCH] xfs: Rudimentary spelling fix

2021-03-21 Thread Bhaskar Chowdhury

On 20:33 Sun 21 Mar 2021, Darrick J. Wong wrote:

On Sun, Mar 21, 2021 at 07:52:41PM -0700, Randy Dunlap wrote:

On 3/21/21 7:46 PM, Bhaskar Chowdhury wrote:
>
> s/sytemcall/systemcall/
>
>
> Signed-off-by: Bhaskar Chowdhury 
> ---
>  fs/xfs/xfs_inode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index f93370bd7b1e..b5eef9f09c00 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -2870,7 +2870,7 @@ xfs_finish_rename(
>  /*
>   * xfs_cross_rename()
>   *
> - * responsible for handling RENAME_EXCHANGE flag in renameat2() sytemcall
> + * responsible for handling RENAME_EXCHANGE flag in renameat2() systemcall
>   */
>  STATIC int
>  xfs_cross_rename(
> --

I'll leave it up to Darrick or someone else.

To me it's "syscall" or "system call".


Agreed; could you change it to one of Randy's suggestions, please?


Sent out a V2. pls check.

--D


--
~Randy





Re: [PATCH] docs: submitting-patches Fix a typo

2021-03-21 Thread Matthew Wilcox
On Mon, Mar 22, 2021 at 09:00:00AM +0530, Bhaskar Chowdhury wrote:
> 
> s/mesages/messages/

did you test the build afterwards?  you forgot to do something.

> Signed-off-by: Bhaskar Chowdhury 
> ---
>  Documentation/process/submitting-patches.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/process/submitting-patches.rst b/Documentation/process/submitting-patches.rst
> index 91de63b201c1..8b2676527b7e 100644
> --- a/Documentation/process/submitting-patches.rst
> +++ b/Documentation/process/submitting-patches.rst
> @@ -679,7 +679,7 @@ generates appropriate diffstats by default.)
>  See more details on the proper patch format in the following
>  references.
> 
> -Backtraces in commit mesages
> +Backtraces in commit messages
>  
> 
>  Backtraces help document the call chain leading to a problem. However,
> --
> 2.31.0
> 


[PATCH V2] xfs: Rudimentary spelling fix

2021-03-21 Thread Bhaskar Chowdhury
s/sytemcall/syscall/

Signed-off-by: Bhaskar Chowdhury 
---
  Changes from V1:
   Randy's suggestion incorporated.

 fs/xfs/xfs_inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index f93370bd7b1e..3087d03a6863 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2870,7 +2870,7 @@ xfs_finish_rename(
 /*
  * xfs_cross_rename()
  *
- * responsible for handling RENAME_EXCHANGE flag in renameat2() sytemcall
+ * responsible for handling RENAME_EXCHANGE flag in renameat2() syscall
  */
 STATIC int
 xfs_cross_rename(
--
2.31.0



linux-next: manual merge of the ftrace tree with the tip tree

2021-03-21 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the ftrace tree got a conflict in:

  arch/x86/kernel/kprobes/ftrace.c

between commit:

  d9f6e12fb0b7 ("x86: Fix various typos in comments")

from the tip tree and commit:

  e0196ae73234 ("ftrace: Fix spelling mistake "disabed" -> "disabled"")

from the ftrace tree.

I fixed it up (I used the former - it fixed a second typo in the same
comment) and can carry the fix as necessary. This is now fixed as far
as linux-next is concerned, but any non trivial conflicts should be
mentioned to your upstream maintainer when your tree is submitted for
merging.  You may also want to consider cooperating with the maintainer
of the conflicting tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell




Re: [PATCH] mm: Fix typos in comments

2021-03-21 Thread Matthew Wilcox
On Mon, Mar 22, 2021 at 03:51:52AM +0100, Ingo Molnar wrote:
> +++ b/mm/huge_memory.c
> @@ -1794,7 +1794,7 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
>  /*
>   * Returns
>   *  - 0 if PMD could not be locked
> - *  - 1 if PMD was locked but protections unchange and TLB flush unnecessary
> + *  - 1 if PMD was locked but protections unchanged and TLB flush unnecessary
>   *  - HPAGE_PMD_NR is protections changed and TLB flush necessary

s/is/if/

> @@ -5306,7 +5306,7 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
>  
>   /*
>* vma need span at least one aligned PUD size and the start,end range
> -  * must at least partialy within it.
> +  * must at least partially within it.

 * vma needs to span at least one aligned PUD size, and the range
 * must be at least partially within in.

>  /*
>   * swapon tell device that all the old swap contents can be discarded,
> - * to allow the swap device to optimize its wear-levelling.
> + * to allow the swap device to optimize its wear-leveling.
>   */

Levelling is british english, leveling is american english.  we don't
usually "correct" one into the other.

Reviewed-by: Matthew Wilcox (Oracle) 


Re: [PATCH 3/3] fuse: fix typo for fuse_conn.max_pages comment

2021-03-21 Thread Jason Wang



On 2021/3/18 9:52 PM, Connor Kuehl wrote:

'Maxmum' -> 'Maximum'



Need a better log here.

With the commit log fixed.

Acked-by: Jason Wang 




Signed-off-by: Connor Kuehl 
---
  fs/fuse/fuse_i.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index f0e4ee906464..8bdee79ba593 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -552,7 +552,7 @@ struct fuse_conn {
/** Maximum write size */
unsigned max_write;
  
-	/** Maxmum number of pages that can be used in a single request */

+   /** Maximum number of pages that can be used in a single request */
unsigned int max_pages;
  
  #if IS_ENABLED(CONFIG_VIRTIO_FS)




Re: [PATCH v3] mm/gup: check page posion status for coredump.

2021-03-21 Thread Aili Yao
On Sat, 20 Mar 2021 00:35:16 +
Matthew Wilcox  wrote:

> On Fri, Mar 19, 2021 at 10:44:37AM +0800, Aili Yao wrote:
> > +++ b/mm/gup.c
> > @@ -1536,6 +1536,10 @@ struct page *get_dump_page(unsigned long addr)
> >   FOLL_FORCE | FOLL_DUMP | FOLL_GET);
> > if (locked)
> > mmap_read_unlock(mm);
> > +
> > +   if (ret == 1 && is_page_poisoned(page))
> > +   return NULL;
> > +
> > return (ret == 1) ? page : NULL;
> >  }
> >  #endif /* CONFIG_ELF_CORE */
> > diff --git a/mm/internal.h b/mm/internal.h
> > index 25d2b2439..902d993 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> > @@ -97,6 +97,27 @@ static inline void set_page_refcounted(struct page *page)
> > set_page_count(page, 1);
> >  }
> >  
> > +/*
> > + * When kernel touch the user page, the user page may be have been marked
> > + * poison but still mapped in user space, if without this page, the kernel
> > + * can guarantee the data integrity and operation success, the kernel is
> > + * better to check the posion status and avoid touching it, be good not to
> > + * panic, coredump for process fatal signal is a sample case matching this
> > + * scenario. Or if kernel can't guarantee the data integrity, it's better
> > + * not to call this function, let kernel touch the poison page and get to
> > + * panic.
> > + */
> > +static inline bool is_page_poisoned(struct page *page)
> > +{
> > +   if (page != NULL) {  
> 
> Why are you checking page for NULL here?  How can it possibly be NULL?

For this get_dump_page() case, it can't be NULL. I thought other places
might call this function without guaranteeing this. But yes, the kernel is a
safer place, and checking a page for NULL is not common practice.

Better to remove it. Thank you very much for pointing this out!

> > +   if (PageHWPoison(page))
> > +   return true;
> > +   else if (PageHuge(page) && PageHWPoison(compound_head(page)))
> > +   return true;
> > +   }
> > +   return 0;
> > +}
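
With the NULL check dropped as discussed above, the helper could shrink to something like the following. This is only a minimal userspace sketch: `struct page` and the `PageHWPoison()`, `PageHuge()` and `compound_head()` stand-ins below are simplified assumptions made for illustration, not the kernel definitions. It also returns `false` rather than `0` to match the `bool` return type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct page (assumption). */
struct page {
	bool hwpoison;		/* models the HWPoison page flag */
	bool huge;		/* models a hugetlb page */
	struct page *head;	/* compound head; a head page points to itself */
};

/* Stand-ins for the kernel page-flag helpers (assumptions). */
static bool PageHWPoison(const struct page *page) { return page->hwpoison; }
static bool PageHuge(const struct page *page) { return page->huge; }
static struct page *compound_head(struct page *page) { return page->head; }

/* Caller guarantees page is non-NULL, as get_dump_page() does. */
static inline bool is_page_poisoned(struct page *page)
{
	if (PageHWPoison(page))
		return true;
	if (PageHuge(page) && PageHWPoison(compound_head(page)))
		return true;
	return false;
}

/* Small usage example: a clean tail page of a poisoned huge page. */
void is_page_poisoned_example(void)
{
	struct page head = { .hwpoison = true, .huge = true, .head = NULL };
	struct page tail = { .hwpoison = false, .huge = true, .head = &head };

	head.head = &head;
	assert(is_page_poisoned(&tail));
}
```
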
> > +
> >  extern unsigned long highest_memmap_pfn;
> >  
> >  /*
> > -- 
> > 1.8.3.1
> > 
> >   
-- 
Thanks!
Aili Yao


linux-next: build failure after merge of the tip tree

2021-03-21 Thread Stephen Rothwell
Hi all,

After merging the tip tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

arch/x86/net/bpf_jit_comp.c: In function 'arch_prepare_bpf_trampoline':
arch/x86/net/bpf_jit_comp.c:2015:16: error: 'ideal_nops' undeclared (first use 
in this function)
 2015 |   memcpy(prog, ideal_nops[NOP_ATOMIC5], X86_PATCH_SIZE);
  |^~
arch/x86/net/bpf_jit_comp.c:2015:16: note: each undeclared identifier is 
reported only once for each function it appears in
arch/x86/net/bpf_jit_comp.c:2015:27: error: 'NOP_ATOMIC5' undeclared (first use 
in this function); did you mean 'GFP_ATOMIC'?
 2015 |   memcpy(prog, ideal_nops[NOP_ATOMIC5], X86_PATCH_SIZE);
  |   ^~~
  |   GFP_ATOMIC

Caused by commit

  a89dfde3dc3c ("x86: Remove dynamic NOP selection")

interacting with commit

  b90829704780 ("bpf: Use NOP_ATOMIC5 instead of emit_nops(, 5) for 
BPF_TRAMP_F_CALL_ORIG")

from the net tree.

I have applied the following merge fix patch.

From: Stephen Rothwell 
Date: Mon, 22 Mar 2021 14:30:37 +1100
Subject: [PATCH] x86: fix up for "bpf: Use NOP_ATOMIC5 instead of
 emit_nops(, 5) for BPF_TRAMP_F_CALL_ORIG"

Signed-off-by: Stephen Rothwell 
---
 arch/x86/net/bpf_jit_comp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index db50ab14df67..e2b5da5d441d 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -2012,7 +2012,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image 
*im, void *image, void *i
/* remember return value in a stack for bpf prog to access */
emit_stx(, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
im->ip_after_call = prog;
-   memcpy(prog, ideal_nops[NOP_ATOMIC5], X86_PATCH_SIZE);
+   memcpy(prog, x86_nops[5], X86_PATCH_SIZE);
prog += X86_PATCH_SIZE;
}
 
-- 
2.30.0

-- 
Cheers,
Stephen Rothwell




Re: [GIT PULL] ext4 fixes for v5.12

2021-03-21 Thread Theodore Ts'o
On Mon, Mar 22, 2021 at 11:05:13AM +0800, Gao Xiang wrote:
> I think the legal name would be "Zhang Yi" (family name goes first [1])
> according to
> The Chinese phonetic alphabet spelling rules for Chinese names [2].
> 
> Indeed, that is also how the legal name is written in the alphabet on our
> passports or credit/debit cards.
> 
> Also, many official English-written materials use it in that way, for
> example, a somewhat famous basketball player Yao Ming [3][4][5].
> 
> That is why I write my own name this way, but I also noticed the western
> ordering of names is quite common for Chinese people in the Linux kernel.
> Anyway, it's just my preliminary personal thought (might be just my
> own preference) according to (I think, maybe) formal occasions.

Yeah, there doesn't seem to be a lot of consistency with the ordering
of Chinese names when they are written in Roman characters.  Some
people put the family name first, and other people will put the
personal (first) name first.  In some cases it may be because the
developer in question is living in America, and so they've decided to
use the American naming convention.  (Two examples of this are former
ext4 developers Mingming Cao and Jiaying Zhang, who live in Portland
and Los Angeles, and their family names are Cao and Zhang,
respectively.)

My personal opinion is people should use whatever name they are
comfortable with, using whatever characters they prefer.  The one
thing that would be helpful for me is for people to give a hint about
how they would prefer to be called --- for example, would you prefer
that I start an e-mail with the salutation, "Hi Gao", "Hi Xiang", or "Hi
Gao Xiang"?

And if I don't know, and I guess wrong, please feel free to correct
me, either privately, or publicly on the e-mail list, if you think
it would be helpful for more people to understand how you'd prefer to
be called.

Cheers,

- Ted


Re: [PATCH] xfs: Rudimentary spelling fix

2021-03-21 Thread Darrick J. Wong
On Sun, Mar 21, 2021 at 07:52:41PM -0700, Randy Dunlap wrote:
> On 3/21/21 7:46 PM, Bhaskar Chowdhury wrote:
> > 
> > s/sytemcall/systemcall/
> > 
> > 
> > Signed-off-by: Bhaskar Chowdhury 
> > ---
> >  fs/xfs/xfs_inode.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > index f93370bd7b1e..b5eef9f09c00 100644
> > --- a/fs/xfs/xfs_inode.c
> > +++ b/fs/xfs/xfs_inode.c
> > @@ -2870,7 +2870,7 @@ xfs_finish_rename(
> >  /*
> >   * xfs_cross_rename()
> >   *
> > - * responsible for handling RENAME_EXCHANGE flag in renameat2() sytemcall
> > + * responsible for handling RENAME_EXCHANGE flag in renameat2() systemcall
> >   */
> >  STATIC int
> >  xfs_cross_rename(
> > --
> 
> I'll leave it up to Darrick or someone else.
> 
> To me it's "syscall" or "system call".

Agreed; could you change it to one of Randy's suggestions, please?

--D

> -- 
> ~Randy
> 


[PATCH v4] gpio: mpc8xxx: Add ACPI support

2021-03-21 Thread Ran Wang
The current implementation only supports DT; add ACPI support.

Signed-off-by: Ran Wang 
---
Change in v4:
 - Update error print for gpiochip_add_data() to fix wrong info. in ACPI case.
 - Update error print for devm_request_irq() to fix panic in ACPI case.
 - Add include property.h and mod_devicetable.h.
 - Correct error handling for mpc8xxx_gc->regs.
 - Replace "!(IS_ERR_OR_NULL(fwnode) || is_of_node(fwnode)))" with 
"is_acpi_node(fwnode)"

Change in v3:
 - Recover ls1028a and ls1088a compatible checking logic

Change in v2:
 - Initialize devtype with NULL to fix compile warning.
 - Replace of_device_get_match_data() and acpi_match_device with 
device_get_match_data().
 - Replace acpi_match_device() with simpler checking logic per Andy's 
suggestion.

 drivers/gpio/gpio-mpc8xxx.c | 47 ++---
 1 file changed, 33 insertions(+), 14 deletions(-)

diff --git a/drivers/gpio/gpio-mpc8xxx.c b/drivers/gpio/gpio-mpc8xxx.c
index 6dfca83bcd90..4b9157a69fca 100644
--- a/drivers/gpio/gpio-mpc8xxx.c
+++ b/drivers/gpio/gpio-mpc8xxx.c
@@ -9,6 +9,7 @@
  * kind, whether express or implied.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -18,6 +19,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -303,8 +306,8 @@ static int mpc8xxx_probe(struct platform_device *pdev)
struct device_node *np = pdev->dev.of_node;
struct mpc8xxx_gpio_chip *mpc8xxx_gc;
struct gpio_chip*gc;
-   const struct mpc8xxx_gpio_devtype *devtype =
-   of_device_get_match_data(>dev);
+   const struct mpc8xxx_gpio_devtype *devtype = NULL;
+   struct fwnode_handle *fwnode;
int ret;
 
mpc8xxx_gc = devm_kzalloc(>dev, sizeof(*mpc8xxx_gc), GFP_KERNEL);
@@ -315,14 +318,14 @@ static int mpc8xxx_probe(struct platform_device *pdev)
 
raw_spin_lock_init(_gc->lock);
 
-   mpc8xxx_gc->regs = of_iomap(np, 0);
-   if (!mpc8xxx_gc->regs)
-   return -ENOMEM;
+   mpc8xxx_gc->regs = devm_platform_ioremap_resource(pdev, 0);
+   if (IS_ERR(mpc8xxx_gc->regs))
+   return PTR_ERR(mpc8xxx_gc->regs);
 
gc = _gc->gc;
gc->parent = >dev;
 
-   if (of_property_read_bool(np, "little-endian")) {
+   if (device_property_read_bool(>dev, "little-endian")) {
ret = bgpio_init(gc, >dev, 4,
 mpc8xxx_gc->regs + GPIO_DAT,
 NULL, NULL,
@@ -345,6 +348,7 @@ static int mpc8xxx_probe(struct platform_device *pdev)
 
mpc8xxx_gc->direction_output = gc->direction_output;
 
+   devtype = device_get_match_data(>dev);
if (!devtype)
devtype = _gpio_devtype_default;
 
@@ -369,24 +373,29 @@ static int mpc8xxx_probe(struct platform_device *pdev)
 * associated input enable must be set (GPIOxGPIE[IEn]=1) to propagate
 * the port value to the GPIO Data Register.
 */
+   fwnode = dev_fwnode(>dev);
if (of_device_is_compatible(np, "fsl,qoriq-gpio") ||
of_device_is_compatible(np, "fsl,ls1028a-gpio") ||
-   of_device_is_compatible(np, "fsl,ls1088a-gpio"))
+   of_device_is_compatible(np, "fsl,ls1088a-gpio") ||
+   is_acpi_node(fwnode))
gc->write_reg(mpc8xxx_gc->regs + GPIO_IBE, 0x);
 
ret = gpiochip_add_data(gc, mpc8xxx_gc);
if (ret) {
-   pr_err("%pOF: GPIO chip registration failed with status %d\n",
-  np, ret);
+   dev_err(>dev,
+   "GPIO chip registration failed with status %d\n", ret);
goto err;
}
 
-   mpc8xxx_gc->irqn = irq_of_parse_and_map(np, 0);
+   mpc8xxx_gc->irqn = platform_get_irq(pdev, 0);
if (!mpc8xxx_gc->irqn)
return 0;
 
-   mpc8xxx_gc->irq = irq_domain_add_linear(np, MPC8XXX_GPIO_PINS,
-   _gpio_irq_ops, mpc8xxx_gc);
+   mpc8xxx_gc->irq = irq_domain_create_linear(fwnode,
+  MPC8XXX_GPIO_PINS,
+  _gpio_irq_ops,
+  mpc8xxx_gc);
+
if (!mpc8xxx_gc->irq)
return 0;
 
@@ -399,8 +408,9 @@ static int mpc8xxx_probe(struct platform_device *pdev)
   IRQF_SHARED, "gpio-cascade",
   mpc8xxx_gc);
if (ret) {
-   dev_err(>dev, "%s: failed to devm_request_irq(%d), ret = 
%d\n",
-   np->full_name, mpc8xxx_gc->irqn, ret);
+   dev_err(>dev,
+   "failed to devm_request_irq(%d), ret = %d\n",
+   mpc8xxx_gc->irqn, ret);
goto err;
}
 
@@ -425,12 +435,21 @@ static int mpc8xxx_remove(struct platform_device *pdev)
return 0;
 }
 
+#ifdef CONFIG_ACPI
+static const struct 

Re: [PATCH] Input: Fix a typo

2021-03-21 Thread Dmitry Torokhov
On Mon, Mar 22, 2021 at 07:50:30AM +0530, Bhaskar Chowdhury wrote:
> 
> s/subsytem/subsystem/
> 
> Signed-off-by: Bhaskar Chowdhury 

Applied, thank you.

-- 
Dmitry


[PATCH] docs: submitting-patches Fix a typo

2021-03-21 Thread Bhaskar Chowdhury


s/mesages/messages/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/process/submitting-patches.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/process/submitting-patches.rst 
b/Documentation/process/submitting-patches.rst
index 91de63b201c1..8b2676527b7e 100644
--- a/Documentation/process/submitting-patches.rst
+++ b/Documentation/process/submitting-patches.rst
@@ -679,7 +679,7 @@ generates appropriate diffstats by default.)
 See more details on the proper patch format in the following
 references.

-Backtraces in commit mesages
+Backtraces in commit messages
 

 Backtraces help document the call chain leading to a problem. However,
--
2.31.0



Re: [PATCH v3 2/3] net: ethernet: actions: Add Actions Semi Owl Ethernet MAC driver

2021-03-21 Thread Florian Fainelli
Hi Cristian,

On 3/21/2021 4:29 PM, Cristian Ciocaltea wrote:
> Add new driver for the Ethernet MAC used on the Actions Semi Owl
> family of SoCs.
> 
> Currently this has been tested only on the Actions Semi S500 SoC
> variant.
> 
> Signed-off-by: Cristian Ciocaltea 

[snip]

Do you know the story behind this Ethernet controller? The various
receive/transmit descriptor definitions are 99% those defined in
drivers/net/ethernet/stmicro/stmmac/descs.h for the normal descriptor.

The register layout of the MAC looks completely different from a
dwmac100 or dwmac1000 however.
-- 
Florian


Re: [PATCH] x86/entry: Fix a typo

2021-03-21 Thread Ingo Molnar


* Bhaskar Chowdhury  wrote:

> On 23:55 Sun 21 Mar 2021, Ingo Molnar wrote:
> > 
> > * Randy Dunlap  wrote:
> > 
> > > 
> > > 
> > > On Mon, 22 Mar 2021, Bhaskar Chowdhury wrote:
> > > 
> > > >
> > > > s/swishes/switch/
> > > 
> > > should be 'switches'
> > 
> > Correct - this patch exchanged a typo for a grammar mistake...
> > 
> Sent a V2 ...check out..

I cannot find it in my mbox or on lkml - but in any case, this should 
be fixed in tip:master too.

Thanks,

Ingo


Re: [PATCH] thermal/drivers/cpuidle_cooling: Fix use after error

2021-03-21 Thread Viresh Kumar
On 19-03-21, 21:25, Daniel Lezcano wrote:
> When the function successfully finishes, it logs a message about
> the registration of the cooling device and uses its name to build the
> message. Unfortunately, the name was freed right before:
> 
> drivers/thermal/cpuidle_cooling.c:218 __cpuidle_cooling_register()
>   warn: 'name' was already freed.
> 
> Fix this by freeing the name after the message is logged.
> 
> Fixes: 6fd1b186d900 ("thermal/drivers/cpuidle_cooling: Use device name 
> instead of auto-numbering")

Why not merge this with the Fixes patch itself, since it isn't in Linus's
tree yet?

Or is your branch strictly immutable?

-- 
viresh


[tip: irq/core] irq: Fix typos in comments

2021-03-21 Thread tip-bot2 for Ingo Molnar
The following commit has been merged into the irq/core branch of tip:

Commit-ID:     a359f757965aafd0f58570de95dc6bc06cf12a9c
Gitweb:        https://git.kernel.org/tip/a359f757965aafd0f58570de95dc6bc06cf12a9c
Author:        Ingo Molnar 
AuthorDate:    Mon, 22 Mar 2021 04:21:30 +01:00
Committer:     Ingo Molnar 
CommitterDate: Mon, 22 Mar 2021 04:23:14 +01:00

irq: Fix typos in comments

Fix ~36 single-word typos in the IRQ, irqchip and irqdomain code comments.

Signed-off-by: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Marc Zyngier 
Cc: Borislav Petkov 
Cc: Peter Zijlstra 
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar 
---
 drivers/irqchip/irq-aspeed-vic.c   |  4 ++--
 drivers/irqchip/irq-bcm7120-l2.c   |  2 +-
 drivers/irqchip/irq-csky-apb-intc.c|  2 +-
 drivers/irqchip/irq-gic-v2m.c  |  2 +-
 drivers/irqchip/irq-gic-v3-its.c   | 10 +-
 drivers/irqchip/irq-gic-v3.c   |  2 +-
 drivers/irqchip/irq-loongson-pch-pic.c |  2 +-
 drivers/irqchip/irq-meson-gpio.c   |  2 +-
 drivers/irqchip/irq-mtk-cirq.c |  2 +-
 drivers/irqchip/irq-mxs.c  |  4 ++--
 drivers/irqchip/irq-sun4i.c|  2 +-
 drivers/irqchip/irq-ti-sci-inta.c  |  2 +-
 drivers/irqchip/irq-vic.c  |  4 ++--
 drivers/irqchip/irq-xilinx-intc.c  |  2 +-
 include/linux/irq.h|  4 ++--
 include/linux/irqdesc.h|  2 +-
 kernel/irq/chip.c  |  2 +-
 kernel/irq/dummychip.c |  2 +-
 kernel/irq/irqdesc.c   |  2 +-
 kernel/irq/irqdomain.c |  8 
 kernel/irq/manage.c|  6 +++---
 kernel/irq/msi.c   |  2 +-
 kernel/irq/timings.c   |  2 +-
 23 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/drivers/irqchip/irq-aspeed-vic.c b/drivers/irqchip/irq-aspeed-vic.c
index 6567ed7..58717cd 100644
--- a/drivers/irqchip/irq-aspeed-vic.c
+++ b/drivers/irqchip/irq-aspeed-vic.c
@@ -71,7 +71,7 @@ static void vic_init_hw(struct aspeed_vic *vic)
writel(0, vic->base + AVIC_INT_SELECT);
writel(0, vic->base + AVIC_INT_SELECT + 4);
 
-   /* Some interrupts have a programable high/low level trigger
+   /* Some interrupts have a programmable high/low level trigger
 * (4 GPIO direct inputs), for now we assume this was configured
 * by firmware. We read which ones are edge now.
 */
@@ -203,7 +203,7 @@ static int __init avic_of_init(struct device_node *node,
}
vic->base = regs;
 
-   /* Initialize soures, all masked */
+   /* Initialize sources, all masked */
vic_init_hw(vic);
 
/* Ready to receive interrupts */
diff --git a/drivers/irqchip/irq-bcm7120-l2.c b/drivers/irqchip/irq-bcm7120-l2.c
index c7c9e97..ad59656 100644
--- a/drivers/irqchip/irq-bcm7120-l2.c
+++ b/drivers/irqchip/irq-bcm7120-l2.c
@@ -309,7 +309,7 @@ static int __init bcm7120_l2_intc_probe(struct device_node 
*dn,
 
if (data->can_wake) {
/* This IRQ chip can wake the system, set all
-* relevant child interupts in wake_enabled mask
+* relevant child interrupts in wake_enabled mask
 */
gc->wake_enabled = 0x;
gc->wake_enabled &= ~gc->unused;
diff --git a/drivers/irqchip/irq-csky-apb-intc.c 
b/drivers/irqchip/irq-csky-apb-intc.c
index 5a2ec43..ab91afa 100644
--- a/drivers/irqchip/irq-csky-apb-intc.c
+++ b/drivers/irqchip/irq-csky-apb-intc.c
@@ -176,7 +176,7 @@ gx_intc_init(struct device_node *node, struct device_node 
*parent)
writel(0x0, reg_base + GX_INTC_NEN63_32);
 
/*
-* Initial mask reg with all unmasked, because we only use enalbe reg
+* Initial mask reg with all unmasked, because we only use enable reg
 */
writel(0x0, reg_base + GX_INTC_NMASK31_00);
writel(0x0, reg_base + GX_INTC_NMASK63_32);
diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
index fbec07d..4116b48 100644
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -371,7 +371,7 @@ static int __init gicv2m_init_one(struct fwnode_handle 
*fwnode,
 * the MSI data is the absolute value within the range from
 * spi_start to (spi_start + num_spis).
 *
-* Broadom NS2 GICv2m implementation has an erratum where the MSI data
+* Broadcom NS2 GICv2m implementation has an erratum where the MSI data
 * is 'spi_number - 32'
 *
 * Reading that register fails on the Graviton implementation
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index ed46e60..c3485b2 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1492,7 +1492,7 @@ static void its_vlpi_set_doorbell(struct irq_data *d, 
bool enable)
 *
 * 

[tip: core/entry] entry: Fix typos in comments

2021-03-21 Thread tip-bot2 for Ingo Molnar
The following commit has been merged into the core/entry branch of tip:

Commit-ID:     97258ce902d1e1c396a4d7c38f6ae7085adb73c5
Gitweb:        https://git.kernel.org/tip/97258ce902d1e1c396a4d7c38f6ae7085adb73c5
Author:        Ingo Molnar 
AuthorDate:    Mon, 22 Mar 2021 03:55:50 +01:00
Committer:     Ingo Molnar 
CommitterDate: Mon, 22 Mar 2021 03:57:39 +01:00

entry: Fix typos in comments

Fix 3 single-word typos in the generic syscall entry code.

Signed-off-by: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Thomas Gleixner 
Cc: Peter Zijlstra 
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar 
---
 include/linux/entry-common.h | 4 ++--
 kernel/entry/common.c| 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 883acef..2e2b8d6 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -360,7 +360,7 @@ void syscall_exit_to_user_mode_work(struct pt_regs *regs);
  *
  * This is a combination of syscall_exit_to_user_mode_work() (1,2) and
  * exit_to_user_mode(). This function is preferred unless there is a
- * compelling architectural reason to use the seperate functions.
+ * compelling architectural reason to use the separate functions.
  */
 void syscall_exit_to_user_mode(struct pt_regs *regs);
 
@@ -381,7 +381,7 @@ void irqentry_enter_from_user_mode(struct pt_regs *regs);
  * irqentry_exit_to_user_mode - Interrupt exit work
  * @regs:  Pointer to current's pt_regs
  *
- * Invoked with interrupts disbled and fully valid regs. Returns with all
+ * Invoked with interrupts disabled and fully valid regs. Returns with all
  * work handled, interrupts disabled such that the caller can immediately
  * switch to user mode. Called from architecture specific interrupt
  * handling code.
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 8442e5c..8d996dd 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -341,7 +341,7 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs 
*regs)
 * Checking for rcu_is_watching() here would prevent the nesting
 * interrupt to invoke rcu_irq_enter(). If that nested interrupt is
 * the tick then rcu_flavor_sched_clock_irq() would wrongfully
-* assume that it is the first interupt and eventually claim
+* assume that it is the first interrupt and eventually claim
 * quiescent state and end grace periods prematurely.
 *
 * Unconditionally invoke rcu_irq_enter() so RCU state stays


Re: [PATCH v5 00/27] Memory Folios

2021-03-21 Thread Matthew Wilcox
On Sat, Mar 20, 2021 at 05:40:37AM +, Matthew Wilcox (Oracle) wrote:
> Current tree at:
> https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio
> 
> (contains another ~100 patches on top of this batch, not all of which are
> in good shape for submission)

I've fixed the two buildbot bugs.  I also resplit the docs work, and
did a bunch of other things to the patches that I haven't posted yet.

I'll send the first three patches as a separate series tomorrow,
and then the next four as their own series, then I'll repost the
rest (up to and including "Convert page wait queues to be folios")
later in the week.


Re: [PATCH V6 4/4] cpufreq: CPPC: Add support for frequency invariance

2021-03-21 Thread Viresh Kumar
On 19-03-21, 17:20, Rafael J. Wysocki wrote:
> Sorry for the delay.
> 
> Acked-by: Rafael J. Wysocki 

Thanks.

> and I'm assuming that either you or the sched guys will take care of it.

Yeah, I have already queued this up.

-- 
viresh


Re: [PATCH 1/3] virtio_ring: always warn when descriptor chain exceeds queue size

2021-03-21 Thread Jason Wang



On 2021/3/18 9:52 PM, Connor Kuehl wrote:

 From section 2.6.5.3.1 (Driver Requirements: Indirect Descriptors)
of the virtio spec:

   "A driver MUST NOT create a descriptor chain longer than the Queue
   Size of the device."

This text suggests that the warning should trigger even if
indirect descriptors are in use.



So I think at least the commit log needs some tweak.

For split virtqueue, we have:

2.6.5.2 Driver Requirements: The Virtqueue Descriptor Table

Drivers MUST NOT add a descriptor chain longer than 2^32 bytes in total; 
this implies that loops in the descriptor chain are forbidden!


2.6.5.3.1 Driver Requirements: Indirect Descriptors

A driver MUST NOT create a descriptor chain longer than the Queue Size 
of the device.


If I understand the spec correctly, the check is only needed for a 
single indirect descriptor table?


For packed virtqueue, we have:

2.7.17 Driver Requirements: Scatter-Gather Support

A driver MUST NOT create a descriptor list longer than allowed by the 
device.


A driver MUST NOT create a descriptor list longer than the Queue Size.

2.7.19 Driver Requirements: Indirect Descriptors

A driver MUST NOT create a descriptor chain longer than allowed by the 
device.


So it looks to me the packed part is fine.

Note that if I understand the spec correctly 2.7.17 implies 2.7.19.

Thanks




Reported-by: Stefan Hajnoczi 
Signed-off-by: Connor Kuehl 
---
  drivers/virtio/virtio_ring.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 71e16b53e9c1..1bc290f9ba13 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -444,11 +444,12 @@ static inline int virtqueue_add_split(struct virtqueue 
*_vq,
  
  	head = vq->free_head;
  
+	WARN_ON_ONCE(total_sg > vq->split.vring.num);

+
if (virtqueue_use_indirect(_vq, total_sg))
desc = alloc_indirect_split(_vq, total_sg, gfp);
else {
desc = NULL;
-   WARN_ON_ONCE(total_sg > vq->split.vring.num && !vq->indirect);
}
  
  	if (desc) {

@@ -1118,6 +1119,8 @@ static inline int virtqueue_add_packed(struct virtqueue 
*_vq,
  
  	BUG_ON(total_sg == 0);
  
+	WARN_ON_ONCE(total_sg > vq->packed.vring.num);

+
if (virtqueue_use_indirect(_vq, total_sg))
return virtqueue_add_indirect_packed(vq, sgs, total_sg,
out_sgs, in_sgs, data, gfp);
@@ -1125,8 +1128,6 @@ static inline int virtqueue_add_packed(struct virtqueue 
*_vq,
head = vq->packed.next_avail_idx;
avail_used_flags = vq->packed.avail_used_flags;
  
-	WARN_ON_ONCE(total_sg > vq->packed.vring.num && !vq->indirect);

-
desc = vq->packed.vring.desc;
i = head;
descs_used = total_sg;
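
The behavioural difference in the WARN_ON_ONCE() expression changed by the patch above can be summarized with a small userspace sketch. The function names `warn_before()` and `warn_after()` are hypothetical and only model the boolean condition; in the split path the old check was additionally only reached when indirect descriptors were not chosen, which this sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>

/* Old condition: silent whenever the queue supports indirect descriptors. */
static bool warn_before(unsigned int total_sg, unsigned int queue_size,
			bool indirect)
{
	return total_sg > queue_size && !indirect;
}

/* New condition: depends only on chain length versus queue size. */
static bool warn_after(unsigned int total_sg, unsigned int queue_size)
{
	return total_sg > queue_size;
}

/* Small usage example: a 300-entry chain on a 256-entry queue. */
void virtqueue_warn_example(void)
{
	assert(!warn_before(300, 256, true));
	assert(warn_after(300, 256));
}
```

Whether the unconditional form is what the spec requires for the split ring is exactly the question raised in the reply above; the sketch takes no position on that.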




[PATCH] scsi: mpt3sas: Fix a typo

2021-03-21 Thread Bhaskar Chowdhury


s/encloure/enclosure/

Signed-off-by: Bhaskar Chowdhury 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index ac066f86bb14..398fd07ee9f5 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -5232,7 +5232,7 @@ _base_static_config_pages(struct MPT3SAS_ADAPTER *ioc)
  * mpt3sas_free_enclosure_list - release memory
  * @ioc: per adapter object
  *
- * Free memory allocated during encloure add.
+ * Free memory allocated during enclosure add.
  */
 void
 mpt3sas_free_enclosure_list(struct MPT3SAS_ADAPTER *ioc)
--
2.31.0



[PATCH -next] x86: Fix unused variable 'msr_val' warning

2021-03-21 Thread Xu Yihang
Fixes the following W=1 kernel build warning(s):
arch/x86/hyperv/hv_spinlock.c:28:16: warning: variable ‘msr_val’ set but not 
used [-Wunused-but-set-variable]
  unsigned long msr_val;

As Hypervisor Top-Level Functional Specification states in chapter 7.5 Virtual 
Processor Idle Sleep State, "A partition which possesses the AccessGuestIdleMsr 
privilege (refer to section 4.2.2) may trigger entry into the virtual processor 
idle sleep state through a read to the hypervisor-defined MSR 
HV_X64_MSR_GUEST_IDLE". That means only a read is necessary, msr_val is not 
used, so __maybe_unused should be added.

Reference:
https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs

Reported-by: Hulk Robot 
Signed-off-by: Xu Yihang 
---
 arch/x86/hyperv/hv_spinlock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/hyperv/hv_spinlock.c b/arch/x86/hyperv/hv_spinlock.c
index f3270c1fc48c..67bc15c7752a 100644
--- a/arch/x86/hyperv/hv_spinlock.c
+++ b/arch/x86/hyperv/hv_spinlock.c
@@ -25,7 +25,7 @@ static void hv_qlock_kick(int cpu)
 
 static void hv_qlock_wait(u8 *byte, u8 val)
 {
-   unsigned long msr_val;
+   unsigned long msr_val __maybe_unused;
unsigned long flags;
 
if (in_nmi())
-- 
2.17.1
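
To illustrate the pattern in the patch above (a read performed only for its side effect, with the result deliberately ignored), here is a minimal userspace sketch. `fake_rdmsrl()` and `idle_msr_reads` are hypothetical stand-ins for the real MSR read, and `__maybe_unused` is spelled out with the GCC/Clang attribute it is assumed to expand to.

```c
#include <assert.h>

/* Userspace spelling of the kernel annotation (assumed expansion). */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

static int idle_msr_reads;	/* counts simulated MSR reads */

/* Hypothetical stand-in for rdmsrl(): the read itself is the side effect. */
#define fake_rdmsrl(val) ((val) = (unsigned long)(++idle_msr_reads))

static void wait_for_kick(void)
{
	/* Set but never read; the attribute silences the W=1 warning. */
	unsigned long msr_val __maybe_unused;

	fake_rdmsrl(msr_val);
}

/* Small usage example: one call performs exactly one read. */
void maybe_unused_example(void)
{
	int before = idle_msr_reads;

	wait_for_kick();
	assert(idle_msr_reads == before + 1);
}
```
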



[PATCH 12/13] usb: mtu3: drop CONFIG_OF

2021-03-21 Thread Chunfeng Yun
The driver can match only the devices created by the OF core
via the DT table, so the table should always be used.

Signed-off-by: Chunfeng Yun 
---
 drivers/usb/mtu3/mtu3_plat.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/usb/mtu3/mtu3_plat.c b/drivers/usb/mtu3/mtu3_plat.c
index d44d5417438d..7786a95a874e 100644
--- a/drivers/usb/mtu3/mtu3_plat.c
+++ b/drivers/usb/mtu3/mtu3_plat.c
@@ -502,25 +502,20 @@ static const struct dev_pm_ops mtu3_pm_ops = {
 
 #define DEV_PM_OPS (IS_ENABLED(CONFIG_PM) ? _pm_ops : NULL)
 
-#ifdef CONFIG_OF
-
 static const struct of_device_id mtu3_of_match[] = {
{.compatible = "mediatek,mt8173-mtu3",},
{.compatible = "mediatek,mtu3",},
{},
 };
-
 MODULE_DEVICE_TABLE(of, mtu3_of_match);
 
-#endif
-
 static struct platform_driver mtu3_driver = {
.probe = mtu3_probe,
.remove = mtu3_remove,
.driver = {
.name = MTU3_DRIVER_NAME,
.pm = DEV_PM_OPS,
-   .of_match_table = of_match_ptr(mtu3_of_match),
+   .of_match_table = mtu3_of_match,
},
 };
 module_platform_driver(mtu3_driver);
-- 
2.18.0
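
As background for the change above, the following userspace sketch shows what the of_match_ptr() wrapper does when CONFIG_OF is not set, and why dropping both the #ifdef and the wrapper keeps the table referenced in every configuration. `struct driver_sketch` and the driver name string are hypothetical; the macro definition mirrors what of_match_ptr() is assumed to look like.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct of_device_id (assumption). */
struct of_device_id {
	const char *compatible;
};

#ifdef CONFIG_OF
#define of_match_ptr(ptr) (ptr)
#else
#define of_match_ptr(ptr) NULL
#endif

static const struct of_device_id mtu3_of_match[] = {
	{ .compatible = "mediatek,mt8173-mtu3" },
	{ .compatible = "mediatek,mtu3" },
	{ NULL },
};

/* Hypothetical, cut-down driver structure for illustration only. */
struct driver_sketch {
	const char *name;
	const struct of_device_id *of_match_table;
};

/* Before: table pointer becomes NULL when CONFIG_OF is not defined. */
static const struct driver_sketch drv_before = {
	.name = "mtu3-sketch",
	.of_match_table = of_match_ptr(mtu3_of_match),
};

/* After: table is always used. */
static const struct driver_sketch drv_after = {
	.name = "mtu3-sketch",
	.of_match_table = mtu3_of_match,
};

/* Small usage example. */
void of_match_example(void)
{
	assert(drv_after.of_match_table == mtu3_of_match);
}
```
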



[PATCH 11/13] usb: mtu3: add support ip-sleep wakeup for MT8192

2021-03-21 Thread Chunfeng Yun
Add support for ip-sleep wakeup on MT8192; it is a specific
revision that does not follow the IP rule.

Signed-off-by: Chunfeng Yun 
---
 drivers/usb/mtu3/mtu3_host.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/usb/mtu3/mtu3_host.c b/drivers/usb/mtu3/mtu3_host.c
index e35b17e5f58e..601656f436a1 100644
--- a/drivers/usb/mtu3/mtu3_host.c
+++ b/drivers/usb/mtu3/mtu3_host.c
@@ -30,6 +30,10 @@
 #define WC0_IS_P   BIT(12) /* polarity */
 #define WC0_IS_EN  BIT(6)
 
+/* mt8192 */
+#define WC0_SSUSB0_CDENBIT(6)
+#define WC0_IS_SPM_EN  BIT(1)
+
 /* mt2712 etc */
 #define PERI_SSUSB_SPM_CTRL0x0
 #define SSC_IP_SLEEP_ENBIT(4)
@@ -39,6 +43,7 @@ enum ssusb_uwk_vers {
SSUSB_UWK_V1 = 1,
SSUSB_UWK_V2,
SSUSB_UWK_V11 = 11, /* specific revision 1.1 */
+   SSUSB_UWK_V12,  /* specific revision 1.2 */
 };
 
 /*
@@ -60,6 +65,11 @@ static void ssusb_wakeup_ip_sleep_set(struct ssusb_mtk 
*ssusb, bool enable)
msk = WC0_IS_EN | WC0_IS_C(0xf) | WC0_IS_P;
val = enable ? (WC0_IS_EN | WC0_IS_C(0x8)) : 0;
break;
+   case SSUSB_UWK_V12:
+   reg = ssusb->uwk_reg_base + PERI_WK_CTRL0;
+   msk = WC0_SSUSB0_CDEN | WC0_IS_SPM_EN;
+   val = enable ? msk : 0;
+   break;
case SSUSB_UWK_V2:
reg = ssusb->uwk_reg_base + PERI_SSUSB_SPM_CTRL;
msk = SSC_IP_SLEEP_EN | SSC_SPM_INT_EN;
-- 
2.18.0



[PATCH 13/13] arm64: dts: mt8183: update wakeup register offset

2021-03-21 Thread Chunfeng Yun
Use the exact wakeup control register offset, and update the revision
number.

Signed-off-by: Chunfeng Yun 
---
 arch/arm64/boot/dts/mediatek/mt8183.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi 
b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 80519a145f13..9d18a938150c 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -874,7 +874,7 @@
clocks = < CLK_INFRA_UNIPRO_SCK>,
 < CLK_INFRA_USB>;
clock-names = "sys_ck", "ref_ck";
-   mediatek,syscon-wakeup = < 0x400 0>;
+   mediatek,syscon-wakeup = < 0x420 11>;
#address-cells = <2>;
#size-cells = <2>;
ranges;
-- 
2.18.0



[PATCH 06/13] usb: xhci-mtk: support ip-sleep wakeup for MT8183

2021-03-21 Thread Chunfeng Yun
Add support for ip-sleep wakeup on MT8183. It is similar to MT8173 and is
also a specific one that does not follow the IPM rule.
Because index 2 is already used by many DTS files, it is better to keep
it unchanged for backward compatibility. Treat the specific ones that do
not follow the IPM rule as revision 1.x, and reserve 3~10 for
later revisions that follow the IPM rule.

Signed-off-by: Chunfeng Yun 
---
 drivers/usb/host/xhci-mtk.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
index 09f2ddbfe8b9..8ba1f914cb75 100644
--- a/drivers/usb/host/xhci-mtk.c
+++ b/drivers/usb/host/xhci-mtk.c
@@ -57,12 +57,19 @@
 #define CTRL_U2_FORCE_PLL_STB  BIT(28)
 
 /* usb remote wakeup registers in syscon */
+
 /* mt8173 etc */
 #define PERI_WK_CTRL1  0x4
 #define WC1_IS_C(x)(((x) & 0xf) << 26)  /* cycle debounce */
 #define WC1_IS_EN  BIT(25)
 #define WC1_IS_P   BIT(6)  /* polarity for ip sleep */
 
+/* mt8183 */
+#define PERI_WK_CTRL0  0x0
+#define WC0_IS_C(x)(((x) & 0xf) << 28)  /* cycle debounce */
+#define WC0_IS_P   BIT(12) /* polarity */
+#define WC0_IS_EN  BIT(6)
+
 /* mt2712 etc */
 #define PERI_SSUSB_SPM_CTRL0x0
 #define SSC_IP_SLEEP_ENBIT(4)
@@ -71,6 +78,7 @@
 enum ssusb_uwk_vers {
SSUSB_UWK_V1 = 1,
SSUSB_UWK_V2,
+   SSUSB_UWK_V11 = 11, /* specific revision 1.1 */
 };
 
 static int xhci_mtk_host_enable(struct xhci_hcd_mtk *mtk)
@@ -300,6 +308,11 @@ static void usb_wakeup_ip_sleep_set(struct xhci_hcd_mtk 
*mtk, bool enable)
msk = WC1_IS_EN | WC1_IS_C(0xf) | WC1_IS_P;
val = enable ? (WC1_IS_EN | WC1_IS_C(0x8)) : 0;
break;
+   case SSUSB_UWK_V11:
+   reg = mtk->uwk_reg_base + PERI_WK_CTRL0;
+   msk = WC0_IS_EN | WC0_IS_C(0xf) | WC0_IS_P;
+   val = enable ? (WC0_IS_EN | WC0_IS_C(0x8)) : 0;
+   break;
case SSUSB_UWK_V2:
reg = mtk->uwk_reg_base + PERI_SSUSB_SPM_CTRL;
msk = SSC_IP_SLEEP_EN | SSC_SPM_INT_EN;
-- 
2.18.0



[PATCH 09/13] usb: xhci-mtk: remove MODULE_ALIAS

2021-03-21 Thread Chunfeng Yun
Since the driver only supports devices created by the OF
core, there seems to be no need for MODULE_ALIAS() anymore.

Signed-off-by: Chunfeng Yun 
---
 drivers/usb/host/xhci-mtk.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
index 7b49064ae5d4..888666c9d7bc 100644
--- a/drivers/usb/host/xhci-mtk.c
+++ b/drivers/usb/host/xhci-mtk.c
@@ -705,7 +705,6 @@ static struct platform_driver mtk_xhci_driver = {
.of_match_table = mtk_xhci_of_match,
},
 };
-MODULE_ALIAS("platform:xhci-mtk");
 
 static int __init xhci_mtk_init(void)
 {
-- 
2.18.0



[PATCH 02/13] dt-bindings: usb: mtk-xhci: add support wakeup for mt8183 and mt8192

2021-03-21 Thread Chunfeng Yun
The wakeup hardware on these two SoCs does not follow the MediaTek
internal IPM rule; both use a specific implementation, like the early
revision on mt8173.

Because index 2 is already used by many DTS files, it is better to keep
it unchanged for backward compatibility. Treat the specific ones that do
not follow the IPM rule as revision 1.x, and reserve 3~10 for later
revisions that follow the IPM rule.

Signed-off-by: Chunfeng Yun 
---
 .../devicetree/bindings/usb/mediatek,mtk-xhci.yaml | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml b/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml
index 2246d29a5e4e..f5dff7fb5755 100644
--- a/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml
+++ b/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml
@@ -30,6 +30,7 @@ properties:
   - mediatek,mt7629-xhci
   - mediatek,mt8173-xhci
   - mediatek,mt8183-xhci
+  - mediatek,mt8192-xhci
   - const: mediatek,mtk-xhci
 
   reg:
@@ -131,10 +132,13 @@ properties:
 - description:
 The second cell represents the register base address of the glue
 layer in syscon
-- description:
+- description: |
 The third cell represents the hardware version of the glue layer,
-1 is used by mt8173 etc, 2 is used by mt2712 etc
-  enum: [1, 2]
+1 - used by mt8173 etc, revision 1 without following IPM rule;
+2 - used by mt2712 etc, revision 2 following IPM rule;
+11 - used by mt8183, specific 1.1;
+12 - used by mt8192, specific 1.2;
+  enum: [1, 2, 11, 12]
 
   mediatek,u3p-dis-msk:
 $ref: /schemas/types.yaml#/definitions/uint32
-- 
2.18.0



[PATCH 08/13] usb: xhci-mtk: drop CONFIG_OF

2021-03-21 Thread Chunfeng Yun
The driver can only match devices created by the OF core
via the DT table, so the table should always be used.

Signed-off-by: Chunfeng Yun 
---
 drivers/usb/host/xhci-mtk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
index 1bfa28c9b5a2..7b49064ae5d4 100644
--- a/drivers/usb/host/xhci-mtk.c
+++ b/drivers/usb/host/xhci-mtk.c
@@ -689,14 +689,12 @@ static const struct dev_pm_ops xhci_mtk_pm_ops = {
 };
 #define DEV_PM_OPS IS_ENABLED(CONFIG_PM) ? &xhci_mtk_pm_ops : NULL
 
-#ifdef CONFIG_OF
 static const struct of_device_id mtk_xhci_of_match[] = {
{ .compatible = "mediatek,mt8173-xhci"},
{ .compatible = "mediatek,mtk-xhci"},
{ },
 };
 MODULE_DEVICE_TABLE(of, mtk_xhci_of_match);
-#endif
 
 static struct platform_driver mtk_xhci_driver = {
.probe  = xhci_mtk_probe,
@@ -704,7 +702,7 @@ static struct platform_driver mtk_xhci_driver = {
.driver = {
.name = "xhci-mtk",
.pm = DEV_PM_OPS,
-   .of_match_table = of_match_ptr(mtk_xhci_of_match),
+   .of_match_table = mtk_xhci_of_match,
},
 };
 MODULE_ALIAS("platform:xhci-mtk");
-- 
2.18.0



[PATCH 10/13] usb: mtu3: support ip-sleep wakeup for MT8183

2021-03-21 Thread Chunfeng Yun
Add support for ip-sleep wakeup on MT8183. It is similar to MT8173:
it is also a specific implementation that does not follow the IPM rule.
Because index 2 is already used by many DTS files, it is better to keep
it unchanged for backward compatibility. Treat the specific ones that do
not follow the IPM rule as revision 1.x, and reserve 3~10 for later
revisions that follow the IPM rule.

Signed-off-by: Chunfeng Yun 
---
 drivers/usb/mtu3/mtu3_host.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/usb/mtu3/mtu3_host.c b/drivers/usb/mtu3/mtu3_host.c
index c871b94f3e6f..e35b17e5f58e 100644
--- a/drivers/usb/mtu3/mtu3_host.c
+++ b/drivers/usb/mtu3/mtu3_host.c
@@ -24,6 +24,12 @@
 #define WC1_IS_EN  BIT(25)
 #define WC1_IS_P   BIT(6)  /* polarity for ip sleep */
 
+/* mt8183 */
+#define PERI_WK_CTRL0  0x0
+#define WC0_IS_C(x)  (((x) & 0xf) << 28)  /* cycle debounce */
+#define WC0_IS_P   BIT(12) /* polarity */
+#define WC0_IS_EN  BIT(6)
+
 /* mt2712 etc */
 #define PERI_SSUSB_SPM_CTRL  0x0
 #define SSC_IP_SLEEP_EN  BIT(4)
@@ -32,6 +38,7 @@
 enum ssusb_uwk_vers {
SSUSB_UWK_V1 = 1,
SSUSB_UWK_V2,
+   SSUSB_UWK_V11 = 11, /* specific revision 1.1 */
 };
 
 /*
@@ -48,6 +55,11 @@ static void ssusb_wakeup_ip_sleep_set(struct ssusb_mtk *ssusb, bool enable)
msk = WC1_IS_EN | WC1_IS_C(0xf) | WC1_IS_P;
val = enable ? (WC1_IS_EN | WC1_IS_C(0x8)) : 0;
break;
+   case SSUSB_UWK_V11:
+   reg = ssusb->uwk_reg_base + PERI_WK_CTRL0;
+   msk = WC0_IS_EN | WC0_IS_C(0xf) | WC0_IS_P;
+   val = enable ? (WC0_IS_EN | WC0_IS_C(0x8)) : 0;
+   break;
case SSUSB_UWK_V2:
reg = ssusb->uwk_reg_base + PERI_SSUSB_SPM_CTRL;
msk = SSC_IP_SLEEP_EN | SSC_SPM_INT_EN;
-- 
2.18.0



[PATCH 05/13] usb: xhci-mtk: support quirk to disable usb2 lpm

2021-03-21 Thread Chunfeng Yun
The xHCI driver supports usb2 HW LPM by default. Add support for the
XHCI_HW_LPM_DISABLE quirk so that usb2 LPM can be disabled when
needed.

Signed-off-by: Chunfeng Yun 
---
 drivers/usb/host/xhci-mtk.c | 3 +++
 drivers/usb/host/xhci-mtk.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
index 1b9f10048fe0..09f2ddbfe8b9 100644
--- a/drivers/usb/host/xhci-mtk.c
+++ b/drivers/usb/host/xhci-mtk.c
@@ -388,6 +388,8 @@ static void xhci_mtk_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci->quirks |= XHCI_SPURIOUS_SUCCESS;
if (mtk->lpm_support)
xhci->quirks |= XHCI_LPM_SUPPORT;
+   if (mtk->u2_lpm_disable)
+   xhci->quirks |= XHCI_HW_LPM_DISABLE;
 
/*
 * MTK xHCI 0.96: PSA is 1 by default even if doesn't support stream,
@@ -470,6 +472,7 @@ static int xhci_mtk_probe(struct platform_device *pdev)
return ret;
 
mtk->lpm_support = of_property_read_bool(node, "usb3-lpm-capable");
+   mtk->u2_lpm_disable = of_property_read_bool(node, "usb2-lpm-disable");
/* optional property, ignore the error if it does not exist */
of_property_read_u32(node, "mediatek,u3p-dis-msk",
  &mtk->u3p_dis_msk);
diff --git a/drivers/usb/host/xhci-mtk.h b/drivers/usb/host/xhci-mtk.h
index 621ec1a85009..4ccd08e20a15 100644
--- a/drivers/usb/host/xhci-mtk.h
+++ b/drivers/usb/host/xhci-mtk.h
@@ -149,6 +149,7 @@ struct xhci_hcd_mtk {
struct phy **phys;
int num_phys;
bool lpm_support;
+   bool u2_lpm_disable;
/* usb remote wakeup */
bool uwk_en;
struct regmap *uwk;
-- 
2.18.0



[PATCH 07/13] usb: xhci-mtk: add support ip-sleep wakeup for MT8192

2021-03-21 Thread Chunfeng Yun
Add support for ip-sleep wakeup on MT8192. It is a specific revision
that does not follow the IPM rule.

Signed-off-by: Chunfeng Yun 
---
 drivers/usb/host/xhci-mtk.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
index 8ba1f914cb75..1bfa28c9b5a2 100644
--- a/drivers/usb/host/xhci-mtk.c
+++ b/drivers/usb/host/xhci-mtk.c
@@ -70,6 +70,10 @@
 #define WC0_IS_P   BIT(12) /* polarity */
 #define WC0_IS_EN  BIT(6)
 
+/* mt8192 */
+#define WC0_SSUSB0_CDEN  BIT(6)
+#define WC0_IS_SPM_EN  BIT(1)
+
 /* mt2712 etc */
 #define PERI_SSUSB_SPM_CTRL  0x0
 #define SSC_IP_SLEEP_EN  BIT(4)
@@ -79,6 +83,7 @@ enum ssusb_uwk_vers {
SSUSB_UWK_V1 = 1,
SSUSB_UWK_V2,
SSUSB_UWK_V11 = 11, /* specific revision 1.1 */
+   SSUSB_UWK_V12,  /* specific revision 1.2 */
 };
 
 static int xhci_mtk_host_enable(struct xhci_hcd_mtk *mtk)
@@ -313,6 +318,11 @@ static void usb_wakeup_ip_sleep_set(struct xhci_hcd_mtk *mtk, bool enable)
msk = WC0_IS_EN | WC0_IS_C(0xf) | WC0_IS_P;
val = enable ? (WC0_IS_EN | WC0_IS_C(0x8)) : 0;
break;
+   case SSUSB_UWK_V12:
+   reg = mtk->uwk_reg_base + PERI_WK_CTRL0;
+   msk = WC0_SSUSB0_CDEN | WC0_IS_SPM_EN;
+   val = enable ? msk : 0;
+   break;
case SSUSB_UWK_V2:
reg = mtk->uwk_reg_base + PERI_SSUSB_SPM_CTRL;
msk = SSC_IP_SLEEP_EN | SSC_SPM_INT_EN;
-- 
2.18.0
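
A small note on the numbering, since SSUSB_UWK_V12 has no explicit initializer: in C an enumerator without one takes the previous value plus one, so it lands on 12 and lines up with the "12 - used by mt8192" entry in the binding. The sketch below restates the enum from the patch; uwk_vers_is_known() is a hypothetical helper added only to illustrate the reserved 3~10 gap, it is not part of the driver.

```c
/* Same declaration order as the driver enum after this patch. */
enum ssusb_uwk_vers {
	SSUSB_UWK_V1 = 1,
	SSUSB_UWK_V2,           /* implicit: 2 */
	SSUSB_UWK_V11 = 11,     /* specific revision 1.1 */
	SSUSB_UWK_V12,          /* implicit: 12, specific revision 1.2 */
};

/* Hypothetical helper: is this third-cell value one of the listed versions? */
static int uwk_vers_is_known(unsigned int vers)
{
	switch (vers) {
	case SSUSB_UWK_V1:
	case SSUSB_UWK_V2:
	case SSUSB_UWK_V11:
	case SSUSB_UWK_V12:
		return 1;
	default:
		return 0;       /* 3~10 stay reserved for later IPM-rule revisions */
	}
}
```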



[PATCH 04/13] usb: xhci-mtk: fix broken streams issue on 0.96 xHCI

2021-03-21 Thread Chunfeng Yun
The MediaTek 0.96 xHCI controller on some platforms does not
support bulk streams even though HCCPARAMS says it does, because
MaxPSASize is set to a default value of 1 by mistake. Use the
XHCI_BROKEN_STREAMS quirk to fix it.

Fixes: 94a631d91ad3 ("usb: xhci-mtk: check hcc_params after adding primary hcd")
Signed-off-by: Chunfeng Yun 
---
 drivers/usb/host/xhci-mtk.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
index 57bcfdfa0465..1b9f10048fe0 100644
--- a/drivers/usb/host/xhci-mtk.c
+++ b/drivers/usb/host/xhci-mtk.c
@@ -388,6 +388,13 @@ static void xhci_mtk_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci->quirks |= XHCI_SPURIOUS_SUCCESS;
if (mtk->lpm_support)
xhci->quirks |= XHCI_LPM_SUPPORT;
+
+   /*
+* MTK xHCI 0.96: PSA is 1 by default even if doesn't support stream,
+* and it's 3 when support it.
+*/
+   if (xhci->hci_version < 0x100 && HCC_MAX_PSA(xhci->hcc_params) == 4)
+   xhci->quirks |= XHCI_BROKEN_STREAMS;
 }
 
 /* called during probe() after chip reset completes */
@@ -549,7 +556,8 @@ static int xhci_mtk_probe(struct platform_device *pdev)
if (ret)
goto put_usb3_hcd;
 
-   if (HCC_MAX_PSA(xhci->hcc_params) >= 4)
+   if (HCC_MAX_PSA(xhci->hcc_params) >= 4 &&
+   !(xhci->quirks & XHCI_BROKEN_STREAMS))
xhci->shared_hcd->can_do_streams = 1;
 
ret = usb_add_hcd(xhci->shared_hcd, irq, IRQF_SHARED);
-- 
2.18.0
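
The "== 4" in the quirk check can look odd next to a comment that talks about PSA being 1. A minimal sketch of the arithmetic follows, assuming HCC_MAX_PSA() decodes bits 15:12 of HCCPARAMS as 2^(MaxPSASize + 1), which is how I remember it from xhci.h; a field value of 1 then decodes to 4. mtk_streams_broken() and mtk_can_do_streams() are hypothetical names that just restate the two conditions from the patch.

```c
#include <stdint.h>

/*
 * Assumed to match HCC_MAX_PSA() in xhci.h: MaxPSASize lives in
 * HCCPARAMS bits 15:12 and the macro yields 2^(MaxPSASize + 1).
 */
#define HCC_MAX_PSA(p)  (1u << ((((p) >> 12) & 0xf) + 1))

/* Condition under which the patch sets XHCI_BROKEN_STREAMS. */
static int mtk_streams_broken(uint16_t hci_version, uint32_t hcc_params)
{
	return hci_version < 0x100 && HCC_MAX_PSA(hcc_params) == 4;
}

/* Condition under which probe() would set can_do_streams after the patch. */
static int mtk_can_do_streams(uint16_t hci_version, uint32_t hcc_params)
{
	return HCC_MAX_PSA(hcc_params) >= 4 &&
	       !mtk_streams_broken(hci_version, hcc_params);
}
```

Under these assumptions a 0.96 controller reporting MaxPSASize = 1 is treated as not supporting streams, while one reporting 3 (decoded as 16) still gets streams, and a 1.0 controller reporting 1 is unaffected.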



[PATCH 03/13] dt-bindings: usb: mtu3: support wakeup for mt8183 and mt8192

2021-03-21 Thread Chunfeng Yun
The wakeup hardware on these two SoCs does not follow the MediaTek
internal IPM rule; both use a specific implementation, like the early
revision on mt8173.

Because index 2 is already used by many DTS files, it is better to keep
it unchanged for backward compatibility. Treat the specific ones that do
not follow the IPM rule as revision 1.x, and reserve 3~10 for later
revisions that follow the IPM rule.

Signed-off-by: Chunfeng Yun 
---
 .../devicetree/bindings/usb/mediatek,mtu3.yaml | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/usb/mediatek,mtu3.yaml b/Documentation/devicetree/bindings/usb/mediatek,mtu3.yaml
index f5c04b9d2de9..918f9d6447c6 100644
--- a/Documentation/devicetree/bindings/usb/mediatek,mtu3.yaml
+++ b/Documentation/devicetree/bindings/usb/mediatek,mtu3.yaml
@@ -24,6 +24,7 @@ properties:
   - mediatek,mt2712-mtu3
   - mediatek,mt8173-mtu3
   - mediatek,mt8183-mtu3
+  - mediatek,mt8192-mtu3
   - const: mediatek,mtu3
 
   reg:
@@ -152,10 +153,13 @@ properties:
 - description:
 The second cell represents the register base address of the glue
 layer in syscon
-- description:
+- description: |
 The third cell represents the hardware version of the glue layer,
-1 is used by mt8173 etc, 2 is used by mt2712 etc
-  enum: [1, 2]
+1 - used by mt8173 etc, revision 1 without following IPM rule;
+2 - used by mt2712 etc, revision 2 with following IPM rule;
+11 - used by mt8183, specific 1.1;
+12 - used by mt8192, specific 1.2;
+  enum: [1, 2, 11, 12]
 
   mediatek,u3p-dis-msk:
 $ref: /schemas/types.yaml#/definitions/uint32
-- 
2.18.0



[PATCH 01/13] dt-bindings: usb: mtk-xhci: support property usb2-lpm-disable

2021-03-21 Thread Chunfeng Yun
Add support for the common property usb2-lpm-disable.

Signed-off-by: Chunfeng Yun 
---
 Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml b/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml
index 14f40efb3b22..2246d29a5e4e 100644
--- a/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml
+++ b/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml
@@ -103,6 +103,10 @@ properties:
 description: supports USB3.0 LPM
 type: boolean
 
+  usb2-lpm-disable:
+description: disable USB2 HW LPM
+type: boolean
+
   imod-interval-ns:
 description:
   Interrupt moderation interval value, it is 8 times as much as that
-- 
2.18.0



Re: [PATCH v1 09/14] mm: multigenerational lru: mm_struct list

2021-03-21 Thread Huang, Ying
Yu Zhao  writes:

> On Wed, Mar 17, 2021 at 11:37:38AM +0800, Huang, Ying wrote:
>> Yu Zhao  writes:
>> 
>> > On Tue, Mar 16, 2021 at 02:44:31PM +0800, Huang, Ying wrote:
>> > The scanning overhead is only one of the two major problems of the
>> > current page reclaim. The other problem is the granularity of the
>> > active/inactive (sizes). We stopped using them in making job
>> > scheduling decision a long time ago. I know another large internet
>> > company adopted a similar approach as ours, and I'm wondering how
>> > everybody else is coping with the discrepancy from those counters.
>> 
>> From intuition, the scanning overhead of the full page table scanning
>> appears higher than that of the rmap scanning for a small portion of
>> system memory.  But from your words, you think the reality is the
>> reverse?  If others are concerned about the overhead too, I think you
>> will eventually need to show, with more data and theory, that the
>> overhead of the page table scanning isn't much higher, or is even lower.
>
> There is a misunderstanding here. I never said anything about full
> page table scanning. And this is not how it's done in this series
> either. I guess the misunderstanding has something to do with the cold
> memory tracking you are thinking about?

If my understanding is correct, there is the following code path in your
patch 10/14:

age_active_anon
  age_lru_gens
    try_walk_mm_list
      walk_mm_list
        walk_mm

So, in kswapd(), the page tables of many processes may be scanned
fully.  If the number of active processes is high, the overhead may be
high too.

> This series uses page tables to discover page accesses when a system
> has run out of inactive pages. Under such a situation, the system is
> very likely to have a lot of page accesses, and using the rmap is
> likely to cost a lot more because of its poor memory locality compared
> with page tables.

This is the theory.  Can you verify this with more data?  Including the
CPU cycles or time spent scanning page tables?

> But, page tables can be sparse too, in terms of hot memory tracking.
> Dave has asked me to test the worst case scenario, which I'll do.
> And I'd be happy to share more data. Any specific workload you are
> interested in?

We can start with some simple workloads that are easier to reason about.
For example,

1. Run the workload with hot and cold pages.  When the free memory
becomes lower than the low watermark, kswapd will be woken up to scan
and reclaim some cold pages.  How long will it take to do that?  It's
expected that almost all pages need to be scanned, so page table
scanning is expected to have less overhead.  We can measure how well it
does.

2. Run the workload with hot and cold pages, where the whole working-set
cannot fit in DRAM, that is, the cold pages will be reclaimed and
swapped in regularly (for example, at tens of MB/s).  It's expected that
fewer pages may be scanned with rmap, but the speed of page table
scanning is faster.

3. Run the workload with hot and cold pages, where the system is
overcommitted, that is, some cold pages will be placed in swap.  But the
cold pages are cold enough, so there's almost no thrashing.  Then the
hot working-set of the workload changes, that is, some hot pages become
cold, while some cold pages become hot, so page reclaiming and swapin
will be triggered.

For each case, we can use some different parameters.  And we can
measure something like the number of pages scanned, the time taken to
scan them, the number of pages reclaimed and swapped in, etc.

Best Regards,
Huang, Ying


Re: [ANNOUNCE] v5.12-rc3-rt3

2021-03-21 Thread Mike Galbraith
On Sun, 2021-03-21 at 08:46 +0100, Mike Galbraith wrote:
> On Sat, 2021-03-20 at 09:18 +0100, Mike Galbraith wrote:
> > On Fri, 2021-03-19 at 23:33 +0100, Sebastian Andrzej Siewior wrote:
> > > Dear RT folks!
> > >
> > > I'm pleased to announce the v5.12-rc3-rt3 patch set.
> >
> > My little rpi4b is fairly unhappy with 5.12-rt, whereas 5.11-rt works
> > fine on it.  The below spew is endless, making boot endless.  I turned
> > it into a WARN_ON_ONCE to see if the thing would finish boot, and
> > surprisingly, it seems perfectly fine with that bad idea. Having not
> > the foggiest clue what I'm doing down in arm arch-land, bug is in no
> > immediate danger :)
>
> Actually, it looks like a defenseless little buglet, and this gripe
> simply wants to be disabled for RT.

Or completely removed instead.

It's entirely possible I'm missing something obvious to arm experts,
but I don't _think_ the register read needs protection, leaving me
wondering why arch_faults_on_old_pte() was born with that warning.

-Mike



[PATCH] drm/imx: imx-ldb: Register LDB channel1 when it is the only channel to be used

2021-03-21 Thread Liu Ying
LDB channel1 should be registered if it is the only channel to be used.
Without this patch, imx_ldb_bind() would skip registering LDB channel1
if LDB channel0 is not used, regardless of whether LDB channel1 needs
to be used.

Fixes: 8767f4711b2b ("drm/imx: imx-ldb: move initialization into probe")
Signed-off-by: Liu Ying 
---
This patch fixes an issue introduced in v5.12-rc1.
It would be good to fix it sooner rather than later.

 drivers/gpu/drm/imx/imx-ldb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/imx/imx-ldb.c b/drivers/gpu/drm/imx/imx-ldb.c
index dbfe39e..b794ed4 100644
--- a/drivers/gpu/drm/imx/imx-ldb.c
+++ b/drivers/gpu/drm/imx/imx-ldb.c
@@ -583,7 +583,7 @@ static int imx_ldb_bind(struct device *dev, struct device *master, void *data)
 struct imx_ldb_channel *channel = &imx_ldb->channel[i];
 
if (!channel->ldb)
-   break;
+   continue;
 
ret = imx_ldb_register(drm, channel);
if (ret)
-- 
2.7.4
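
To make the effect of the one-line change concrete, here is a minimal stand-alone sketch of the loop shape, not the driver code itself. struct channel is a hypothetical, trimmed-down stand-in that keeps only the pointer the loop tests, and count_registered() is an illustrative helper whose stop_on_unused flag selects between the old 'break' and the new 'continue'.

```c
#include <stddef.h>

#define NUM_CHANNELS 2

/* Hypothetical, trimmed-down channel: only the field the loop tests. */
struct channel {
	const void *ldb;        /* NULL when the channel is unused */
};

/* Count channels the loop would register; stop_on_unused models 'break'. */
static int count_registered(const struct channel ch[NUM_CHANNELS],
			    int stop_on_unused)
{
	int i, registered = 0;

	for (i = 0; i < NUM_CHANNELS; i++) {
		if (!ch[i].ldb) {
			if (stop_on_unused)
				break;          /* old behavior: channel1 never reached */
			continue;               /* patched behavior: skip, keep going */
		}
		registered++;
	}
	return registered;
}
```

With channel0 unused and channel1 in use, the 'break' variant registers nothing and the 'continue' variant registers one channel, which is the case the patch describes.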


