[PATCH] wireless: iwlwifi: Fix a double free in iwl_txq_dyn_alloc_dma

2021-04-02 Thread Lv Yunlong
In iwl_txq_dyn_alloc_dma, txq->tfds is freed at first time by:
iwl_txq_alloc()->goto err_free_tfds->dma_free_coherent(). But
it forgot to set txq->tfds to NULL.

Then the txq->tfds is freed again in iwl_txq_dyn_alloc_dma by:
goto error->iwl_txq_gen2_free_memory()->dma_free_coherent().

My patch sets txq->tfds to NULL after the first free to avoid the
double free.

Fixes: 0cd1ad2d7fd41 ("iwlwifi: move all bus-independent TX functions to common 
code")
Signed-off-by: Lv Yunlong 
---
 drivers/net/wireless/intel/iwlwifi/queue/tx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/wireless/intel/iwlwifi/queue/tx.c 
b/drivers/net/wireless/intel/iwlwifi/queue/tx.c
index 833f43d1ca7a..99c8e473031a 100644
--- a/drivers/net/wireless/intel/iwlwifi/queue/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/queue/tx.c
@@ -1101,6 +1101,7 @@ int iwl_txq_alloc(struct iwl_trans *trans, struct iwl_txq 
*txq, int slots_num,
return 0;
 err_free_tfds:
dma_free_coherent(trans->dev, tfd_sz, txq->tfds, txq->dma_addr);
+   txq->tfds = NULL;
 error:
if (txq->entries && cmd_queue)
for (i = 0; i < slots_num; i++)
-- 
2.25.1




Re: [External] [PATCH 2/3] mm: Charge active memcg when no mm is set

2021-04-02 Thread Muchun Song
On Sat, Apr 3, 2021 at 3:17 AM Dan Schatzberg  wrote:
>
> set_active_memcg() worked for kernel allocations but was silently
> ignored for user pages.
>
> This patch establishes a precedence order for who gets charged:
>
> 1. If there is a memcg associated with the page already, that memcg is
>charged. This happens during swapin.
>
> 2. If an explicit mm is passed, mm->memcg is charged. This happens
>during page faults, which can be triggered in remote VMs (eg gup).
>
> 3. Otherwise consult the current process context. If there is an
>active_memcg, use that. Otherwise, current->mm->memcg.
>
> Previously, if a NULL mm was passed to mem_cgroup_charge (case 3) it
> would always charge the root cgroup. Now it looks up the active_memcg
> first (falling back to charging the root cgroup if not set).
>
> Signed-off-by: Dan Schatzberg 
> Acked-by: Johannes Weiner 
> Acked-by: Tejun Heo 
> Acked-by: Chris Down 
> Reviewed-by: Shakeel Butt 

Reviewed-by: Muchun Song 

Thanks.


Re: [PATCH] drm/amdgpu: Fix a potential sdma invalid access

2021-04-02 Thread Qu Huang

Hi Christian,

On 2021/4/3 0:25, Christian König wrote:

Hi Qu,

Am 02.04.21 um 05:18 schrieb Qu Huang:

Before dma_resv_lock(bo->base.resv, NULL) in amdgpu_bo_release_notify(),
the bo->base.resv lock may be held by ttm_mem_evict_first(),


That can't happen since when bo_release_notify is called the BO has not
more references and is therefore deleted.

And we never evict a deleted BO, we just wait for it to become idle.


Yes, the bo reference counter return to zero will enter
ttm_bo_release(),but notify bo release (call amdgpu_bo_release_notify())
first happen, and then test if a reservation object's fences have been
signaled, and then mark bo as deleted and remove bo from the LRU list.

When ttm_bo_release() and ttm_mem_evict_first() is concurrent,
the Bo has not been removed from the LRU list and is not marked as
deleted, this will happen.

As a test, when we use CPU memset instead of SDMA fill in
amdgpu_bo_release_notify(), the result is page fault:

PID: 5490   TASK: 8e8136e04100  CPU: 4   COMMAND: "gemmPerf"
  #0 [8e79eaa17970] machine_kexec at b2863784
  #1 [8e79eaa179d0] __crash_kexec at b291ce92
  #2 [8e79eaa17aa0] crash_kexec at b291cf80
  #3 [8e79eaa17ab8] oops_end at b2f6c768
  #4 [8e79eaa17ae0] no_context at b2f5aaa6
  #5 [8e79eaa17b30] __bad_area_nosemaphore at b2f5ab3d
  #6 [8e79eaa17b80] bad_area_nosemaphore at b2f5acae
  #7 [8e79eaa17b90] __do_page_fault at b2f6f6c0
  #8 [8e79eaa17c00] do_page_fault at b2f6f925
  #9 [8e79eaa17c30] page_fault at b2f6b758
 [exception RIP: memset+31]
 RIP: b2b8668f  RSP: 8e79eaa17ce8  RFLAGS: 00010a17
 RAX: bebebebebebebebe  RBX: 8e747bff10c0  RCX: 060b0020
 RDX:   RSI: 00be  RDI: ab807f00
 RBP: 8e79eaa17d10   R8: 8e79eaa14000   R9: ab7c8000
 R10: bcba  R11: 01ba  R12: 8e79ebaa4050
 R13: ab7c8000  R14: 00022600  R15: 8e8136e04100
 ORIG_RAX:   CS: 0010  SS: 0018
#10 [8e79eaa17ce8] amdgpu_bo_release_notify at c092f2d1 [amdgpu]
#11 [8e79eaa17d18] ttm_bo_release at c08f39dd [amdttm]
#12 [8e79eaa17d58] amdttm_bo_put at c08f3c8c [amdttm]
#13 [8e79eaa17d68] amdttm_bo_vm_close at c08f7ac9 [amdttm]
#14 [8e79eaa17d80] remove_vma at b29ef115
#15 [8e79eaa17da0] exit_mmap at b29f2c64
#16 [8e79eaa17e58] mmput at b28940c7
#17 [8e79eaa17e78] do_exit at b289dc95
#18 [8e79eaa17f10] do_group_exit at b289e4cf
#19 [8e79eaa17f40] sys_exit_group at b289e544
#20 [8e79eaa17f50] system_call_fastpath at b2f74ddb

Regards,
Qu.



Regards,
Christian.


and the VRAM mem will be evicted, mem region was replaced
by Gtt mem region. amdgpu_bo_release_notify() will then
hold the bo->base.resv lock, and SDMA will get an invalid
address in amdgpu_fill_buffer(), resulting in a VMFAULT
or memory corruption.

To avoid it, we have to hold bo->base.resv lock first, and
check whether the mem.mem_type is TTM_PL_VRAM.

Signed-off-by: Qu Huang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 4b29b82..8018574 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1300,12 +1300,16 @@ void amdgpu_bo_release_notify(struct
ttm_buffer_object *bo)
  if (bo->base.resv == >base._resv)
  amdgpu_amdkfd_remove_fence_on_pt_pd_bos(abo);

-    if (bo->mem.mem_type != TTM_PL_VRAM || !bo->mem.mm_node ||
-    !(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
+    if (!(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
  return;

  dma_resv_lock(bo->base.resv, NULL);

+    if (bo->mem.mem_type != TTM_PL_VRAM || !bo->mem.mm_node) {
+    dma_resv_unlock(bo->base.resv);
+    return;
+    }
+
  r = amdgpu_fill_buffer(abo, AMDGPU_POISON, bo->base.resv, );
  if (!WARN_ON(r)) {
  amdgpu_bo_fence(abo, fence, false);
--
1.8.3.1





[PATCH] kfence: unpoison pool region before use

2021-04-02 Thread Peter Collingbourne
If the memory region allocated by KFENCE had previously been poisoned,
any validity checks done using kasan_byte_accessible() will fail. Fix
it by unpoisoning the memory before using it as the pool region.

Link: 
https://linux-review.googlesource.com/id/I0af99e9f1c25eaf7e1ec295836b5d148d76940c5
Signed-off-by: Peter Collingbourne 
---
 mm/kfence/core.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index d53c91f881a4..bb22b0cf77aa 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -633,13 +633,19 @@ static DECLARE_DELAYED_WORK(kfence_timer, 
toggle_allocation_gate);
 
 void __init kfence_alloc_pool(void)
 {
+   void *pool;
+
if (!kfence_sample_interval)
return;
 
-   __kfence_pool = memblock_alloc(KFENCE_POOL_SIZE, PAGE_SIZE);
-
-   if (!__kfence_pool)
+   pool = memblock_alloc(KFENCE_POOL_SIZE, PAGE_SIZE);
+   if (!pool) {
pr_err("failed to allocate pool\n");
+   return;
+   }
+
+   kasan_unpoison_range(pool, KFENCE_POOL_SIZE);
+   __kfence_pool = pool;
 }
 
 void __init kfence_init(void)
-- 
2.31.0.208.g409f899ff0-goog



Re: [PATCH] Kbuild: Update config_data.gz only if KCONFIG_CONFIG materially changed

2021-04-02 Thread Masahiro Yamada
On Fri, Apr 2, 2021 at 7:45 AM Elliot Berman  wrote:
>
> If you update the timestamp of KCONFIG_CONFIG without actually changing
> anything, config_data.gz is re-generated and causes vmlinux to re-link.
> When Link Time Optimization is enabled, unnecessary re-linking of
> vmlinux is highly desirable since it adds several minutes to build time.
>
> Avoid touching config_data.gz by using filechk to compare the existing
> config_data.gz and update only if it changed.
>
> The .config can be touched, for instance, by a build script which
> installs the default defconfig and then applies a defconfig fragment on
> top.
>
> For a simple example on my x86 machine, I modified x86 default defconfig to 
> set
> CONFIG_IKCONFIG=y and run:
>   make -j50 defconfig tiny.config vmlinux
>   make -j50 defconfig tiny.config vmlinux
> With this patch, vmlinux is not re-built as a result of config_data.gz
> change.
>
> Signed-off-by: Elliot Berman 
> ---
>  kernel/Makefile  | 2 +-
>  scripts/Makefile.lib | 2 ++
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/Makefile b/kernel/Makefile
> index 320f1f3..bd4e558 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -140,7 +140,7 @@ $(obj)/configs.o: $(obj)/config_data.gz
>
>  targets += config_data.gz
>  $(obj)/config_data.gz: $(KCONFIG_CONFIG) FORCE
> -   $(call if_changed,gzip)
> +   $(call filechk,gzip)


I do not think this is the right approach
because gzip is executed every time, even
if the time stamp is not changed.







>
>  $(obj)/kheaders.o: $(obj)/kheaders_data.tar.xz
>
> diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
> index a4fbaf8..81d3ec1 100644
> --- a/scripts/Makefile.lib
> +++ b/scripts/Makefile.lib
> @@ -282,6 +282,8 @@ cmd_objcopy = $(OBJCOPY) $(OBJCOPYFLAGS) 
> $(OBJCOPYFLAGS_$(@F)) $< $@
>  quiet_cmd_gzip = GZIP$@
>cmd_gzip = cat $(real-prereqs) | $(KGZIP) -n -f -9 > $@
>
> +filechk_gzip = cat $(real-prereqs) | $(KGZIP) -n -f -9
> +
>  # DTC
>  # ---
>  DTC ?= $(objtree)/scripts/dtc/dtc
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>


-- 
Best Regards
Masahiro Yamada


[PATCH v3] IB/mlx5: Reduce max order of memory allocated for xlt update

2021-04-02 Thread Praveen Kumar Kannoju
To update xlt (during mlx5_ib_reg_user_mr()), the driver can request up to
1 MB (order-8) memory, depending on the size of the MR. This costly
allocation can sometimes take very long to return (a few seconds). This
causes the calling application to hang for a long time, especially when the
system is fragmented.  To avoid these long latency spikes, the calls the
higher order allocations need to fail faster in case they are not
available. In order to acheive this we need __GFP_NORETRY flag in the
gfp_mask before during fetching the free pages. This patch adds this flag
to the mask.

Signed-off-by: Praveen Kumar Kannoju 
---
 drivers/infiniband/hw/mlx5/mr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index db05b0e..429e7aa6 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1028,7 +1028,7 @@ static void *mlx5_ib_alloc_xlt(size_t *nents, size_t 
ent_size, gfp_t gfp_mask)
 */
might_sleep();
 
-   gfp_mask |= __GFP_ZERO;
+   gfp_mask |= __GFP_ZERO | __GFP_NORETRY;
 
/*
 * If the system already has a suitable high order page then just use
-- 
1.8.3.1



[PATCH v3 9/9] w1: ds2438: support for writing to offset register

2021-04-02 Thread Luiz Sampaio
Added a sysfs entry to support writing to the offset register on page1.
This register is used to calibrate the chip canceling offset errors in the
current ADC. This means that, over time, reading the IAD register will not
return the correct current measurement, it will have an offset. Writing to
the offset register if the two's complement of the current register while
passing zero current to the load will calibrate the measurements. This
change was tested on real hardware and it was able to calibrate the chip
correctly.

Signed-off-by: Luiz Sampaio 
---
 Documentation/w1/slaves/w1_ds2438.rst | 11 +-
 drivers/w1/slaves/w1_ds2438.c | 49 +++
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/Documentation/w1/slaves/w1_ds2438.rst 
b/Documentation/w1/slaves/w1_ds2438.rst
index ac8d0d4b0d0e..5c5573991351 100644
--- a/Documentation/w1/slaves/w1_ds2438.rst
+++ b/Documentation/w1/slaves/w1_ds2438.rst
@@ -22,7 +22,7 @@ is also often used in weather stations and applications such 
as: rain gauge,
 wind speed/direction measuring, humidity sensing, etc.
 
 Current support is provided through the following sysfs files (all files
-except "iad" are readonly):
+except "iad" and "offset" are readonly):
 
 "iad"
 -
@@ -52,6 +52,15 @@ Internally when this file is read, the additional CRC byte 
is also obtained
 from the slave device. If it is correct, the 8 bytes page data are passed
 to userspace, otherwise an I/O error is returned.
 
+"offset"
+---
+This file controls the 2-byte Offset Register of the chip.
+Writing a 2-byte value will change the Offset Register, which changes the
+current measurement done by the chip. Changing this register to the two's 
complement
+of the current register while forcing zero current through the load will 
calibrate
+the chip, canceling offset errors in the current ADC.
+
+
 "temperature"
 -
 Opening and reading this file initiates the CONVERT_T (temperature conversion)
diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index 2cfdfedb584f..223a9aae6e2d 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -193,6 +193,34 @@ static int w1_ds2438_change_config_bit(struct w1_slave 
*sl, u8 mask, u8 value)
return -1;
 }
 
+static int w1_ds2438_change_offset_register(struct w1_slave *sl, u8 *value)
+{
+   unsigned int retries = W1_DS2438_RETRIES;
+   u8 w1_buf[9];
+   u8 w1_page1_buf[DS2438_PAGE_SIZE + 1 /*for CRC*/];
+
+   if (w1_ds2438_get_page(sl, 1, w1_page1_buf) == 0) {
+   memcpy(_buf[2], w1_page1_buf, DS2438_PAGE_SIZE - 1); /* 
register 7 is reserved */
+   w1_buf[7] = value[0]; /* change only offset register */
+   w1_buf[8] = value[1];
+   while (retries--) {
+   if (w1_reset_select_slave(sl))
+   continue;
+   w1_buf[0] = W1_DS2438_WRITE_SCRATCH;
+   w1_buf[1] = 0x01; /* write to page 1 */
+   w1_write_block(sl->master, w1_buf, 9);
+
+   if (w1_reset_select_slave(sl))
+   continue;
+   w1_buf[0] = W1_DS2438_COPY_SCRATCH;
+   w1_buf[1] = 0x01;
+   w1_write_block(sl->master, w1_buf, 2);
+   return 0;
+   }
+   }
+   return -1;
+}
+
 static int w1_ds2438_get_voltage(struct w1_slave *sl,
 int adc_input, uint16_t *voltage)
 {
@@ -364,6 +392,25 @@ static ssize_t page1_read(struct file *filp, struct 
kobject *kobj,
return ret;
 }
 
+static ssize_t offset_write(struct file *filp, struct kobject *kobj,
+   struct bin_attribute *bin_attr, char *buf,
+   loff_t off, size_t count)
+{
+   struct w1_slave *sl = kobj_to_w1_slave(kobj);
+   int ret;
+
+   mutex_lock(>master->bus_mutex);
+
+   if (w1_ds2438_change_offset_register(sl, buf) == 0)
+   ret = count;
+   else
+   ret = -EIO;
+
+   mutex_unlock(>master->bus_mutex);
+
+   return ret;
+}
+
 static ssize_t temperature_read(struct file *filp, struct kobject *kobj,
struct bin_attribute *bin_attr, char *buf,
loff_t off, size_t count)
@@ -430,6 +477,7 @@ static ssize_t vdd_read(struct file *filp, struct kobject 
*kobj,
 static BIN_ATTR(iad, 0664, iad_read, iad_write, 0);
 static BIN_ATTR_RO(page0, DS2438_PAGE_SIZE);
 static BIN_ATTR_RO(page1, DS2438_PAGE_SIZE);
+static BIN_ATTR_WO(offset, 2);
 static BIN_ATTR_RO(temperature, 0/* real length varies */);
 static BIN_ATTR_RO(vad, 0/* real length varies */);
 static BIN_ATTR_RO(vdd, 0/* real length varies */);
@@ -438,6 +486,7 @@ static struct bin_attribute *w1_ds2438_bin_attrs[] = {
_attr_iad,
_attr_page0,
_attr_page1,
+   

[PATCH v3 8/9] w1: ds2438: adding support for reading page1

2021-04-02 Thread Luiz Sampaio
Added a sysfs entry to support reading the page1 registers. This registers
contain Elapsed Time Meter (ETM) data, which shows for how long the chip is
on, as well as an Offset Register data, which can be used to calibrate the
current measurement of the chip.

Signed-off-by: Luiz Sampaio 
---
 Documentation/w1/slaves/w1_ds2438.rst |  8 ++
 drivers/w1/slaves/w1_ds2438.c | 41 +++
 2 files changed, 49 insertions(+)

diff --git a/Documentation/w1/slaves/w1_ds2438.rst 
b/Documentation/w1/slaves/w1_ds2438.rst
index a29309a3f8e5..ac8d0d4b0d0e 100644
--- a/Documentation/w1/slaves/w1_ds2438.rst
+++ b/Documentation/w1/slaves/w1_ds2438.rst
@@ -44,6 +44,14 @@ Internally when this file is read, the additional CRC byte 
is also obtained
 from the slave device. If it is correct, the 8 bytes page data are passed
 to userspace, otherwise an I/O error is returned.
 
+"page1"
+---
+This file provides full 8 bytes of the chip Page 1 (01h).
+This page contains the ICA, elapsed time meter and current offset data of the 
DS2438.
+Internally when this file is read, the additional CRC byte is also obtained
+from the slave device. If it is correct, the 8 bytes page data are passed
+to userspace, otherwise an I/O error is returned.
+
 "temperature"
 -
 Opening and reading this file initiates the CONVERT_T (temperature conversion)
diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index ef6217ecb1cb..2cfdfedb584f 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -49,6 +49,15 @@
 #define DS2438_CURRENT_MSB 0x06
 #define DS2438_THRESHOLD   0x07
 
+/* Page #1 definitions */
+#define DS2438_ETM_0   0x00
+#define DS2438_ETM_1   0x01
+#define DS2438_ETM_2   0x02
+#define DS2438_ETM_3   0x03
+#define DS2438_ICA 0x04
+#define DS2438_OFFSET_LSB  0x05
+#define DS2438_OFFSET_MSB  0x06
+
 static int w1_ds2438_get_page(struct w1_slave *sl, int pageno, u8 *buf)
 {
unsigned int retries = W1_DS2438_RETRIES;
@@ -325,6 +334,36 @@ static ssize_t page0_read(struct file *filp, struct 
kobject *kobj,
return ret;
 }
 
+static ssize_t page1_read(struct file *filp, struct kobject *kobj,
+ struct bin_attribute *bin_attr, char *buf,
+ loff_t off, size_t count)
+{
+   struct w1_slave *sl = kobj_to_w1_slave(kobj);
+   int ret;
+   u8 w1_buf[DS2438_PAGE_SIZE + 1 /*for CRC*/];
+
+   if (off != 0)
+   return 0;
+   if (!buf)
+   return -EINVAL;
+
+   mutex_lock(>master->bus_mutex);
+
+   /* Read no more than page1 size */
+   if (count > DS2438_PAGE_SIZE)
+   count = DS2438_PAGE_SIZE;
+
+   if (w1_ds2438_get_page(sl, 1, w1_buf) == 0) {
+   memcpy(buf, _buf, count);
+   ret = count;
+   } else
+   ret = -EIO;
+
+   mutex_unlock(>master->bus_mutex);
+
+   return ret;
+}
+
 static ssize_t temperature_read(struct file *filp, struct kobject *kobj,
struct bin_attribute *bin_attr, char *buf,
loff_t off, size_t count)
@@ -390,6 +429,7 @@ static ssize_t vdd_read(struct file *filp, struct kobject 
*kobj,
 
 static BIN_ATTR(iad, 0664, iad_read, iad_write, 0);
 static BIN_ATTR_RO(page0, DS2438_PAGE_SIZE);
+static BIN_ATTR_RO(page1, DS2438_PAGE_SIZE);
 static BIN_ATTR_RO(temperature, 0/* real length varies */);
 static BIN_ATTR_RO(vad, 0/* real length varies */);
 static BIN_ATTR_RO(vdd, 0/* real length varies */);
@@ -397,6 +437,7 @@ static BIN_ATTR_RO(vdd, 0/* real length varies */);
 static struct bin_attribute *w1_ds2438_bin_attrs[] = {
_attr_iad,
_attr_page0,
+   _attr_page1,
_attr_temperature,
_attr_vad,
_attr_vdd,
-- 
2.30.1



[PATCH v3 7/9] w1: ds2438: fixing bug that would always get page0

2021-04-02 Thread Luiz Sampaio
The purpose of the w1_ds2438_get_page function is to get the register
values at the page passed as the pageno parameter. However, the page0 was
hardcoded, such that the function always returned the page0 contents. Fixed
so that the function can retrieve any page.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index ccb06b8c2d78..ef6217ecb1cb 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -62,13 +62,13 @@ static int w1_ds2438_get_page(struct w1_slave *sl, int 
pageno, u8 *buf)
if (w1_reset_select_slave(sl))
continue;
w1_buf[0] = W1_DS2438_RECALL_MEMORY;
-   w1_buf[1] = 0x00;
+   w1_buf[1] = (u8)pageno;
w1_write_block(sl->master, w1_buf, 2);
 
if (w1_reset_select_slave(sl))
continue;
w1_buf[0] = W1_DS2438_READ_SCRATCH;
-   w1_buf[1] = 0x00;
+   w1_buf[1] = (u8)pageno;
w1_write_block(sl->master, w1_buf, 2);
 
count = w1_read_block(sl->master, buf, DS2438_PAGE_SIZE + 1);
-- 
2.30.1



[PATCH v3 6/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread Luiz Sampaio
Changed the permissions to preferred octal style.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index 56e53a748059..ccb06b8c2d78 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -388,7 +388,7 @@ static ssize_t vdd_read(struct file *filp, struct kobject 
*kobj,
return ret;
 }
 
-static BIN_ATTR(iad, S_IRUGO | S_IWUSR | S_IWGRP, iad_read, iad_write, 0);
+static BIN_ATTR(iad, 0664, iad_read, iad_write, 0);
 static BIN_ATTR_RO(page0, DS2438_PAGE_SIZE);
 static BIN_ATTR_RO(temperature, 0/* real length varies */);
 static BIN_ATTR_RO(vad, 0/* real length varies */);
-- 
2.30.1



[PATCH v3 5/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread Luiz Sampaio
Since there is only one statement inside the if clause, no brackets
are required.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index eeb593294fbd..56e53a748059 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -380,9 +380,9 @@ static ssize_t vdd_read(struct file *filp, struct kobject 
*kobj,
if (!buf)
return -EINVAL;
 
-   if (w1_ds2438_get_voltage(sl, DS2438_ADC_INPUT_VDD, ) == 0) {
+   if (w1_ds2438_get_voltage(sl, DS2438_ADC_INPUT_VDD, ) == 0)
ret = snprintf(buf, count, "%u\n", voltage);
-   } else
+   else
ret = -EIO;
 
return ret;
-- 
2.30.1



[PATCH v3 4/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread Luiz Sampaio
Since there is only one statement inside the if clause, no brackets
are required.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index eca50cec304f..eeb593294fbd 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -359,9 +359,9 @@ static ssize_t vad_read(struct file *filp, struct kobject 
*kobj,
if (!buf)
return -EINVAL;
 
-   if (w1_ds2438_get_voltage(sl, DS2438_ADC_INPUT_VAD, ) == 0) {
+   if (w1_ds2438_get_voltage(sl, DS2438_ADC_INPUT_VAD, ) == 0)
ret = snprintf(buf, count, "%u\n", voltage);
-   } else
+   else
ret = -EIO;
 
return ret;
-- 
2.30.1



[PATCH v3 3/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread Luiz Sampaio
Since there is only one statement inside the if clause, no brackets
are required.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index a5bb53042c93..eca50cec304f 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -338,9 +338,9 @@ static ssize_t temperature_read(struct file *filp, struct 
kobject *kobj,
if (!buf)
return -EINVAL;
 
-   if (w1_ds2438_get_temperature(sl, ) == 0) {
+   if (w1_ds2438_get_temperature(sl, ) == 0)
ret = snprintf(buf, count, "%i\n", temp);
-   } else
+   else
ret = -EIO;
 
return ret;
-- 
2.30.1



[PATCH v3 1/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread Luiz Sampaio
There is an if statement and, if the function goes into it, it returns.
So, the next else is not required.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index 5cfb0ae23e91..148921fb9702 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -154,11 +154,11 @@ static int w1_ds2438_change_config_bit(struct w1_slave 
*sl, u8 mask, u8 value)
 
if ((status & mask) == value)
return 0;   /* already set as requested */
-   else {
-   /* changing bit */
-   status ^= mask;
-   perform_write = 1;
-   }
+
+   /* changing bit */
+   status ^= mask;
+   perform_write = 1;
+
break;
}
 
-- 
2.30.1



[PATCH v3 2/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread Luiz Sampaio
Since there is only one statement inside the if clause, no brackets
are required.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index 148921fb9702..a5bb53042c93 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -287,9 +287,9 @@ static ssize_t iad_read(struct file *filp, struct kobject 
*kobj,
if (!buf)
return -EINVAL;
 
-   if (w1_ds2438_get_current(sl, ) == 0) {
+   if (w1_ds2438_get_current(sl, ) == 0)
ret = snprintf(buf, count, "%i\n", voltage);
-   } else
+   else
ret = -EIO;
 
return ret;
-- 
2.30.1



[PATCH v3 0/9] w1: ds2438: adding support for calibration of current measurements

2021-04-02 Thread Luiz Sampaio
The following patches aim to make a user able to calibrate the current 
measurement of the DS2438. This chip uses a offset register in page1, which is 
added to the current register to give the user the current measurement. If this 
value is wrong, the user will get an offset current value, even if the current 
is zero, for instance. This patch gives support for reading the page1 registers 
(including the offset register) and for writing to the offset register. The 
DS2438 datasheet shows a calibration routine, and with this patch, the user can 
do this quickly by writing the correct value to the offset register. This patch 
was tested on real hardware using a power supply and an electronic load.
Please help to review this series of patches.

Best regards!
Sampaio
---
Changes in v3:
- I accidentally added a wrong line that would not compile. I'm sorry. Fixed it.

Changes in v2:
- Using git send-email to send the patches
- Adding documentation as requested
- Separating the coding style changes in different patches as requested

Luiz Sampaio (9):
  w1: ds2438: fixed a coding style issue
  w1: ds2438: fixed a coding style issue
  w1: ds2438: fixed a coding style issue
  w1: ds2438: fixed a coding style issue
  w1: ds2438: fixed a coding style issue
  w1: ds2438: fixed a coding style issue
  w1: ds2438: fixing bug that would always get page0
  w1: ds2438: adding support for reading page1
  w1: ds2438: support for writing to offset register

 Documentation/w1/slaves/w1_ds2438.rst |  19 +++-
 drivers/w1/slaves/w1_ds2438.c | 122 ++
 2 files changed, 124 insertions(+), 17 deletions(-)

-- 
2.30.1



Re: [PATCH] drm/msm/a6xx: fix for kernels without CONFIG_NVMEM

2021-04-02 Thread Akhil P Oommen

On 4/2/2021 3:19 AM, Rob Clark wrote:

On Thu, Apr 1, 2021 at 2:03 PM Dmitry Baryshkov
 wrote:


On Thu, 1 Apr 2021 at 23:09, Rob Clark  wrote:


On Mon, Feb 22, 2021 at 8:06 AM Rob Clark  wrote:


On Mon, Feb 22, 2021 at 7:45 AM Akhil P Oommen  wrote:


On 2/19/2021 9:30 PM, Rob Clark wrote:

On Fri, Feb 19, 2021 at 2:44 AM Akhil P Oommen  wrote:


On 2/18/2021 9:41 PM, Rob Clark wrote:

On Thu, Feb 18, 2021 at 4:28 AM Akhil P Oommen  wrote:


On 2/18/2021 2:05 AM, Jonathan Marek wrote:

On 2/17/21 3:18 PM, Rob Clark wrote:

On Wed, Feb 17, 2021 at 11:08 AM Jordan Crouse
 wrote:


On Wed, Feb 17, 2021 at 07:14:16PM +0530, Akhil P Oommen wrote:

On 2/17/2021 8:36 AM, Rob Clark wrote:

On Tue, Feb 16, 2021 at 12:10 PM Jonathan Marek 
wrote:


Ignore nvmem_cell_get() EOPNOTSUPP error in the same way as a
ENOENT error,
to fix the case where the kernel was compiled without CONFIG_NVMEM.

Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu")
Signed-off-by: Jonathan Marek 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index ba8e9d3cf0fe..7fe5d97606aa 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1356,10 +1356,10 @@ static int a6xx_set_supported_hw(struct
device *dev, struct a6xx_gpu *a6xx_gpu,

cell = nvmem_cell_get(dev, "speed_bin");
/*
-* -ENOENT means that the platform doesn't support
speedbin which is
-* fine
+* -ENOENT means no speed bin in device tree,
+* -EOPNOTSUPP means kernel was built without CONFIG_NVMEM


very minor nit, it would be nice to at least preserve the gist of the
"which is fine" (ie. some variation of "this is an optional thing and
things won't catch fire without it" ;-))

(which is, I believe, is true, hopefully Akhil could confirm.. if not
we should have a harder dependency on CONFIG_NVMEM..)

IIRC, if the gpu opp table in the DT uses the 'opp-supported-hw'
property,
we will see some error during boot up if we don't call
dev_pm_opp_set_supported_hw(). So calling "nvmem_cell_get(dev,
"speed_bin")"
is a way to test this.

If there is no other harm, we can put a hard dependency on
CONFIG_NVMEM.


I'm not sure if we want to go this far given the squishiness about
module
dependencies. As far as I know we are the only driver that uses this
seriously
on QCOM SoCs and this is only needed for certain targets. I don't
know if we
want to force every target to build NVMEM and QFPROM on our behalf.
But maybe
I'm just saying that because Kconfig dependencies tend to break my
brain (and
then Arnd has to send a patch to fix it).



Hmm, good point.. looks like CONFIG_NVMEM itself doesn't have any
other dependencies, so I suppose it wouldn't be the end of the world
to select that.. but I guess we don't want to require QFPROM

I guess at the end of the day, what is the failure mode if you have a
speed-bin device, but your kernel config misses QFPROM (and possibly
NVMEM)?  If the result is just not having the highest clk rate(s)


Atleast on sc7180's gpu, using an unsupported FMAX breaks gmu. It won't
be very obvious what went wrong when this happens!


Ugg, ok..

I suppose we could select NVMEM, but not QFPROM, and then the case
where QFPROM is not enabled on platforms that have the speed-bin field
in DT will fail gracefully and all other platforms would continue on
happily?

BR,
-R


Sounds good to me.



You probably should do a quick test with NVMEM enabled but QFPROM
disabled to confirm my theory, but I *think* that should work

BR,
-R



I tried it on an sc7180 device. The suggested combo (CONFIG_NVMEM + no
CONFIG_QCOM_QFPROM) makes the gpu probe fail with error "failed to read
speed-bin. Some OPPs may not be supported by hardware". This is good
enough clue for the developer that he should fix the broken speedbin
detection.



Ok, great.. then sounds like selecting NVMEM is a good approach



btw, did anyone ever send a patch to select NVMEM?  I'm not seeing one
but I could be overlooking something
I thought Jonathan would send it as the discussion was going on in his 
patch. No problem, I will send it out. :)


-Akhil.



Judging by the amount of issues surrounding speed-bin, I might have a
bold suggestion to revert these patches for now and get them once all
the issues are sorted, so that we'd have a single working commit
instead of scattered patch series breaking git bisect, having bad
side-effects on non-sc7180 platforms, etc.



We do really need some pre-merge CI like we have on the mesa side of
things (and we at least have 845 devices in our CI farm, but it would
be useful to add more generations)..  but other than the config issue,
I *think* this fixes the last of the speedbin fallout?

https://patchwork.freedesktop.org/patch/426538/?series=88558=1

Planning to include that in a -fixes pull req in the next day or 

[PATCH] tty: use printk_safe context at tty_msg()

2021-04-02 Thread Tetsuo Handa
syzbot is reporting circular locking dependency due to calling printk()
with port lock held [1]. When this problem was reported, we worried
whether printk_safe context will remain available in future kernels [2],
and then this problem was forgotten. But in order to utilize syzbot's
resource for finding other bugs/reproducers by closing this one of top
crashers, let's apply a patch which counts on availability of printk_safe
context.

syzbot is also reporting same dependency due to memory allocation fault
injection at tty_buffer_alloc(). Although __GFP_NOWARN cannot prevent
memory allocation fault injection from calling printk(), let's use
__GFP_NOWARN at tty_buffer_alloc() in addition to using printk_safe
context, for generating many lines of messages due to warn_alloc() is
annoying. If we want to report it, we can use pr_warn() instead.

[1] 
https://syzkaller.appspot.com/bug?id=39ea6caa479af471183997376dc7e90bc7d64a6a
[2] https://lkml.kernel.org/r/20190218054649.GA26686@jagdpanzerIV

Reported-by: syzbot 
Reported-by: syzbot 
Signed-off-by: Tetsuo Handa 
Fixes: b6da31b2c07c46f2 ("tty: Fix data race in 
tty_insert_flip_string_fixed_flag")
Cc:  # 4.18+
---
 drivers/tty/tty_buffer.c | 5 -
 include/linux/tty.h  | 9 -
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 6d4995a5f318..d59f7873bc49 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -156,6 +156,7 @@ static struct tty_buffer *tty_buffer_alloc(struct tty_port 
*port, size_t size)
 {
struct llist_node *free;
struct tty_buffer *p;
+   unsigned long flags;
 
/* Round the buffer size out */
size = __ALIGN_MASK(size, TTYB_ALIGN_MASK);
@@ -172,7 +173,9 @@ static struct tty_buffer *tty_buffer_alloc(struct tty_port 
*port, size_t size)
   have queued and recycle that ? */
if (atomic_read(>buf.mem_used) > port->buf.mem_limit)
return NULL;
-   p = kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC);
+   printk_safe_enter_irqsave(flags);
+   p = kmalloc(sizeof(struct tty_buffer) + 2 * size, GFP_ATOMIC | 
__GFP_NOWARN);
+   printk_safe_exit_irqrestore(flags);
if (p == NULL)
return NULL;
 
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 95fc2f100f12..7ae8eb46fec3 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include <../../kernel/printk/internal.h>
 
 
 /*
@@ -773,7 +774,13 @@ static inline void proc_tty_unregister_driver(struct 
tty_driver *d) {}
 #endif
 
 #define tty_msg(fn, tty, f, ...) \
-   fn("%s %s: " f, tty_driver_name(tty), tty_name(tty), ##__VA_ARGS__)
+   do {\
+   unsigned long flags;\
+   \
+   printk_safe_enter_irqsave(flags);   \
+   fn("%s %s: " f, tty_driver_name(tty), tty_name(tty), 
##__VA_ARGS__); \
+   printk_safe_exit_irqrestore(flags); \
+   } while (0)
 
 #define tty_debug(tty, f, ...) tty_msg(pr_debug, tty, f, ##__VA_ARGS__)
 #define tty_info(tty, f, ...)  tty_msg(pr_info, tty, f, ##__VA_ARGS__)
-- 
2.18.4



Re: [PATCH v2 00/10] erofs: add big pcluster compression support

2021-04-02 Thread Chao Yu

On 2021/4/1 11:29, Gao Xiang wrote:

Hi folks,

This is the formal version of EROFS big pcluster support, which means
EROFS can compress data into more than 1 fs block after this patchset.

{l,p}cluster are EROFS-specific concepts, standing for `logical cluster'
and `physical cluster' correspondingly. Logical cluster is the basic unit
of compress indexes in file logical mapping, e.g. it can build compress
indexes in 2 blocks rather than 1 block (currently only 1 block lcluster
is supported). Physical cluster is a container of physical compressed
blocks which contains compressed data, the size of which is the multiple
of lclustersize.

Different from previous thoughts, which had fixed-sized pclusterblks
recorded in the on-disk compress index header, our on-disk design allows
variable-sized pclusterblks now. The main reasons are
  - user data varies in compression ratio locally, so fixed-sized
clustersize approach is space-wasting and causes extra read
amplificationfor high CR cases;

  - inplace decompression needs zero padding to guarantee its safe margin,
but we don't want to pad more than 1 fs block for big pcluster;

  - end users can now customize the pcluster size according to data type
since various pclustersize can exist in a file, for example, using
different pcluster size for executable code and one-shot data. such
design should be more flexible than many other public compression fses
(Btw, each file in EROFS can have maximum 2 algorithms at the same time
by using HEAD1/2, which will be formally added with LZMA support.)

In brief, EROFS can now compress from variable-sized input to
variable-sized pcluster blocks, as illustrated below:

   |<-_lcluster_->||<-_lcluster_->|
   |._|_ .. ___|___.__|
 ..
  . .
   .__.
   |__| .. |__|
   |<-  pcluster->|

The next step would be how to record the compressed block count in
lclusters. In compress indexes, there are 2 concepts called HEAD and
NONHEAD lclusters. The difference is that HEAD lcluster starts a new
pcluster in the lcluster, but NONHEAD not. It's easy to understand
that big pclusters at least have 2 pclusters, thus at least 2 lclusters
as well.

Therefore, let the delta0 (distance to its HEAD lcluster) of first NONHEAD
compress index store the compressed block count with a special flag as a
new called CBLKCNT compress index. It's also easy to know its delta0 is
constantly 1, as illustrated below:
   
  |_HEAD_|_CBLKCNT_|_NONHEAD_|_..._|_NONHEAD_|_HEAD | HEAD |
 |<-- a pcluster with CBLKCNT ->|<-- -->|
^ a pcluster with 1

If another HEAD follows a HEAD lcluster, there is no room to record
CBLKCNT, but it's easy to know the size of pcluster will be 1.

More implementation details about this and compact indexes are in the
commit message.

On the runtime performance side, the current EROFS test results are:
  
|  file system  |   size| seq read | rand read | rand9m read |
|___|___|_ MiB/s __|__ MiB/s __|___ MiB/s ___|
|___erofs_4k|_556879872_|_ 781.4 __|__ 55.3 ___|___ 25.3  ___|
|___erofs_16k___|_452509696_|_ 864.8 __|_ 123.2 ___|___ 20.8  ___|
|___erofs_32k___|_415223808_|_ 899.8 __|_ 105.8 _*_|___ 16.8 |
|___erofs_64k___|_393814016_|_ 906.6 __|__ 66.6 _*_|___ 11.8 |
|__squashfs_8k__|_556191744_|_  64.9 __|__ 19.3 ___| 9.1 |
|__squashfs_16k_|_502661120_|_  98.9 __|__ 38.0 ___| 9.8 |
|__squashfs_32k_|_458784768_|_ 115.4 __|__ 71.6 _*_|___ 10.0 |
|_squashfs_128k_|_398204928_|_ 257.2 __|_ 253.8 _*_|___ 10.9 |
|ext4_4k|()_|_ 786.6 __|__ 28.6 ___|___ 27.8 |


* Squashfs grabs more page cache to keep all decompressed data with
   grab_cache_page_nowait() than the normal requested readahead (see
   squashfs_copy_cache and squashfs_readpage_block).
   In principle, EROFS can also cache such all decompressed data
   if necessary, yet it's low priority for now and has little use
   (rand9m is actually a better rand read workload, since the amount
of I/O is 9m rather than full-sized 1000m).

More details are in
https://lore.kernel.org/r/20210329053654.ga3281...@xiangao.remote.csb

Also it's easy to know EROFS is not a fixed pcluster design, so users
can make several optimized strategy according to data type when mkfs.
And there is still room to optimize runtime performance for big pcluster
even further.

Finally, it passes ro_fsstress and can also successfully boot buildroot
& Android system with android-mainline repo.

current mkfs repo for big pcluster:

/usr/bin/ld: ll_temac_main.c:undefined reference to `devm_of_iomap'

2021-04-02 Thread kernel test robot
Hi Andre,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   d93a0d43e3d0ba9e19387be4dae4a8d5b175a8d7
commit: e8b6c54f6d57822e228027d41a1edb317034a08c net: xilinx: temac: Relax 
Kconfig dependencies
date:   1 year ago
config: um-randconfig-r004-20210403 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e8b6c54f6d57822e228027d41a1edb317034a08c
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout e8b6c54f6d57822e228027d41a1edb317034a08c
# save the attached .config to linux build tree
make W=1 ARCH=um 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   /usr/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x8): undefined 
reference to `X86_FEATURE_XMM2'
   /usr/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x15): undefined 
reference to `X86_FEATURE_XMM2'
   /usr/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x22): undefined 
reference to `X86_FEATURE_XMM'
   /usr/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x2f): undefined 
reference to `X86_FEATURE_XMM'
   /usr/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x3c): undefined 
reference to `X86_FEATURE_XMM'
   /usr/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x49): undefined 
reference to `X86_FEATURE_XMM'
   /usr/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x56): undefined 
reference to `X86_FEATURE_XMM'
   /usr/bin/ld: kernel/irq/generic-chip.o:(.altinstructions+0x63): more 
undefined references to `X86_FEATURE_XMM' follow
   /usr/bin/ld: drivers/net/ethernet/xilinx/ll_temac_main.o: in function 
`temac_probe':
   ll_temac_main.c:(.text+0x13e7): undefined reference to `devm_ioremap'
>> /usr/bin/ld: ll_temac_main.c:(.text+0x151b): undefined reference to 
>> `devm_of_iomap'
   /usr/bin/ld: ll_temac_main.c:(.text+0x1754): undefined reference to 
`devm_ioremap'
   /usr/bin/ld: drivers/net/ethernet/xilinx/xilinx_axienet_main.o: in function 
`axienet_probe':
   xilinx_axienet_main.c:(.text+0xf8d): undefined reference to 
`devm_ioremap_resource'
   /usr/bin/ld: xilinx_axienet_main.c:(.text+0x1112): undefined reference to 
`devm_ioremap_resource'
   /usr/bin/ld: xilinx_axienet_main.c:(.text+0x1315): undefined reference to 
`devm_ioremap_resource'
   /usr/bin/ld: drivers/misc/altera-stapl/altera-lpt.o:(.altinstructions+0x8): 
undefined reference to `X86_FEATURE_XMM2'
   /usr/bin/ld: drivers/misc/altera-stapl/altera-lpt.o:(.altinstructions+0x15): 
undefined reference to `X86_FEATURE_XMM'
   /usr/bin/ld: drivers/misc/altera-stapl/altera-lpt.o:(.altinstructions+0x22): 
undefined reference to `X86_FEATURE_XMM'
   /usr/bin/ld: drivers/misc/altera-stapl/altera-lpt.o:(.altinstructions+0x2f): 
undefined reference to `X86_FEATURE_XMM2'
   /usr/bin/ld: drivers/misc/altera-stapl/altera-lpt.o:(.altinstructions+0x3c): 
undefined reference to `X86_FEATURE_XMM'
   /usr/bin/ld: drivers/misc/altera-stapl/altera-lpt.o:(.altinstructions+0x49): 
undefined reference to `X86_FEATURE_XMM'
   /usr/bin/ld: drivers/mtd/nand/raw/nand_legacy.o:(.altinstructions+0x8): 
undefined reference to `X86_FEATURE_XMM2'
   /usr/bin/ld: drivers/mtd/nand/raw/nand_legacy.o:(.altinstructions+0x15): 
undefined reference to `X86_FEATURE_XMM2'
   collect2: error: ld returned 1 exit status

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH v2 6/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread kernel test robot
Hi Luiz,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.12-rc5 next-20210401]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Luiz-Sampaio/w1-ds2438-adding-support-for-calibration-of-current-measurements/20210403-092618
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
d93a0d43e3d0ba9e19387be4dae4a8d5b175a8d7
config: powerpc64-randconfig-r026-20210402 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 
0fe8af94688aa03c01913c2001d6a1a911f42ce6)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc64 cross compiling tool for clang build
# apt-get install binutils-powerpc64-linux-gnu
# 
https://github.com/0day-ci/linux/commit/6282c64dc6cf3656cc3a9034b04f22d2a655ad39
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Luiz-Sampaio/w1-ds2438-adding-support-for-calibration-of-current-measurements/20210403-092618
git checkout 6282c64dc6cf3656cc3a9034b04f22d2a655ad39
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross 
ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/w1/slaves/w1_ds2438.c:391:40: error: too few arguments provided to 
>> function-like macro invocation
   static BIN_ATTR(iad, 0664, iad_write, 0);
  ^
   include/linux/sysfs.h:219:9: note: macro 'BIN_ATTR' defined here
   #define BIN_ATTR(_name, _mode, _read, _write, _size)\
   ^
>> drivers/w1/slaves/w1_ds2438.c:398:3: error: use of undeclared identifier 
>> 'bin_attr_iad'; did you mean 'bin_attr_vad'?
   _attr_iad,
^~~~
bin_attr_vad
   drivers/w1/slaves/w1_ds2438.c:394:8: note: 'bin_attr_vad' declared here
   static BIN_ATTR_RO(vad, 0/* real length varies */);
  ^
   include/linux/sysfs.h:224:22: note: expanded from macro 'BIN_ATTR_RO'
   struct bin_attribute bin_attr_##_name = __BIN_ATTR_RO(_name, _size)
^
   :49:1: note: expanded from here
   bin_attr_vad
   ^
   2 errors generated.


vim +391 drivers/w1/slaves/w1_ds2438.c

   390  
 > 391  static BIN_ATTR(iad, 0664, iad_write, 0);
   392  static BIN_ATTR_RO(page0, DS2438_PAGE_SIZE);
   393  static BIN_ATTR_RO(temperature, 0/* real length varies */);
   394  static BIN_ATTR_RO(vad, 0/* real length varies */);
   395  static BIN_ATTR_RO(vdd, 0/* real length varies */);
   396  
   397  static struct bin_attribute *w1_ds2438_bin_attrs[] = {
 > 398  _attr_iad,
   399  _attr_page0,
   400  _attr_temperature,
   401  _attr_vad,
   402  _attr_vdd,
   403  NULL,
   404  };
   405  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH v2 00/10] erofs: add big pcluster compression support

2021-04-02 Thread Gao Xiang
On Thu, Apr 01, 2021 at 11:29:44AM +0800, Gao Xiang wrote:
> Hi folks,
> 
> This is the formal version of EROFS big pcluster support, which means
> EROFS can compress data into more than 1 fs block after this patchset.
> 
> {l,p}cluster are EROFS-specific concepts, standing for `logical cluster'
> and `physical cluster' correspondingly. Logical cluster is the basic unit
> of compress indexes in file logical mapping, e.g. it can build compress
> indexes in 2 blocks rather than 1 block (currently only 1 block lcluster
> is supported). Physical cluster is a container of physical compressed
> blocks which contains compressed data, the size of which is the multiple
> of lclustersize.
> 
> Different from previous thoughts, which had fixed-sized pclusterblks
> recorded in the on-disk compress index header, our on-disk design allows
> variable-sized pclusterblks now. The main reasons are
>  - user data varies in compression ratio locally, so fixed-sized
>clustersize approach is space-wasting and causes extra read
>amplificationfor high CR cases;
> 
>  - inplace decompression needs zero padding to guarantee its safe margin,
>but we don't want to pad more than 1 fs block for big pcluster;
> 
>  - end users can now customize the pcluster size according to data type
>since various pclustersize can exist in a file, for example, using
>different pcluster size for executable code and one-shot data. such
>design should be more flexible than many other public compression fses
>(Btw, each file in EROFS can have maximum 2 algorithms at the same time
>by using HEAD1/2, which will be formally added with LZMA support.)
> 
> In brief, EROFS can now compress from variable-sized input to
> variable-sized pcluster blocks, as illustrated below:
> 
>   |<-_lcluster_->||<-_lcluster_->|
>   |._|_ .. ___|___.__|
> ..
>  . .
>   .__.
>   |__| .. |__|
>   |<-  pcluster->|
> 
> The next step would be how to record the compressed block count in
> lclusters. In compress indexes, there are 2 concepts called HEAD and
> NONHEAD lclusters. The difference is that HEAD lcluster starts a new
> pcluster in the lcluster, but NONHEAD not. It's easy to understand
> that big pclusters at least have 2 pclusters, thus at least 2 lclusters
> as well.
> 
> Therefore, let the delta0 (distance to its HEAD lcluster) of first NONHEAD
> compress index store the compressed block count with a special flag as a
> new called CBLKCNT compress index. It's also easy to know its delta0 is
> constantly 1, as illustrated below:
>   
>  |_HEAD_|_CBLKCNT_|_NONHEAD_|_..._|_NONHEAD_|_HEAD | HEAD |
> |<-- a pcluster with CBLKCNT ->|<-- -->|
>^ a pcluster with 1
> 
> If another HEAD follows a HEAD lcluster, there is no room to record
> CBLKCNT, but it's easy to know the size of pcluster will be 1.
> 
> More implementation details about this and compact indexes are in the
> commit message.
> 
> On the runtime performance side, the current EROFS test results are:
>  
> |  file system  |   size| seq read | rand read | rand9m read |
> |___|___|_ MiB/s __|__ MiB/s __|___ MiB/s ___|
> |___erofs_4k|_556879872_|_ 781.4 __|__ 55.3 ___|___ 25.3  ___|
> |___erofs_16k___|_452509696_|_ 864.8 __|_ 123.2 ___|___ 20.8  ___|
> |___erofs_32k___|_415223808_|_ 899.8 __|_ 105.8 _*_|___ 16.8 |
> |___erofs_64k___|_393814016_|_ 906.6 __|__ 66.6 _*_|___ 11.8 |
> |__squashfs_8k__|_556191744_|_  64.9 __|__ 19.3 ___| 9.1 |
> |__squashfs_16k_|_502661120_|_  98.9 __|__ 38.0 ___| 9.8 |
> |__squashfs_32k_|_458784768_|_ 115.4 __|__ 71.6 _*_|___ 10.0 |
> |_squashfs_128k_|_398204928_|_ 257.2 __|_ 253.8 _*_|___ 10.9 |
> |ext4_4k|()_|_ 786.6 __|__ 28.6 ___|___ 27.8 |
> 
> 
> * Squashfs grabs more page cache to keep all decompressed data with
>   grab_cache_page_nowait() than the normal requested readahead (see
>   squashfs_copy_cache and squashfs_readpage_block).
>   In principle, EROFS can also cache such all decompressed data
>   if necessary, yet it's low priority for now and has little use
>   (rand9m is actually a better rand read workload, since the amount
>of I/O is 9m rather than full-sized 1000m).
> 
> More details are in
> https://lore.kernel.org/r/20210329053654.ga3281...@xiangao.remote.csb
> 
> Also it's easy to know EROFS is not a fixed pcluster design, so users
> can make several optimized strategy according to data type when mkfs.
> And there is still room to optimize runtime performance for big pcluster
> even further.
> 
> Finally, it passes 

Re: [PATCH v2 6/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread kernel test robot
Hi Luiz,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.12-rc5 next-20210401]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Luiz-Sampaio/w1-ds2438-adding-support-for-calibration-of-current-measurements/20210403-092618
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
d93a0d43e3d0ba9e19387be4dae4a8d5b175a8d7
config: alpha-randconfig-r023-20210402 (attached as .config)
compiler: alpha-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/6282c64dc6cf3656cc3a9034b04f22d2a655ad39
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Luiz-Sampaio/w1-ds2438-adding-support-for-calibration-of-current-measurements/20210403-092618
git checkout 6282c64dc6cf3656cc3a9034b04f22d2a655ad39
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=alpha 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/w1/slaves/w1_ds2438.c:391:40: error: macro "BIN_ATTR" requires 5 
>> arguments, but only 4 given
 391 | static BIN_ATTR(iad, 0664, iad_write, 0);
 |^
   In file included from include/linux/kobject.h:20,
from include/linux/module.h:20,
from drivers/w1/slaves/w1_ds2438.c:9:
   include/linux/sysfs.h:219: note: macro "BIN_ATTR" defined here
 219 | #define BIN_ATTR(_name, _mode, _read, _write, _size)   \
 | 
>> drivers/w1/slaves/w1_ds2438.c:391:8: error: type defaults to 'int' in 
>> declaration of 'BIN_ATTR' [-Werror=implicit-int]
 391 | static BIN_ATTR(iad, 0664, iad_write, 0);
 |^~~~
>> drivers/w1/slaves/w1_ds2438.c:398:3: error: 'bin_attr_iad' undeclared here 
>> (not in a function); did you mean 'bin_attr_vad'?
 398 |  _attr_iad,
 |   ^~~~
 |   bin_attr_vad
   drivers/w1/slaves/w1_ds2438.c:391:8: warning: 'BIN_ATTR' defined but not 
used [-Wunused-variable]
 391 | static BIN_ATTR(iad, 0664, iad_write, 0);
 |^~~~
   drivers/w1/slaves/w1_ds2438.c:277:16: warning: 'iad_read' defined but not 
used [-Wunused-function]
 277 | static ssize_t iad_read(struct file *filp, struct kobject *kobj,
 |^~~~
   drivers/w1/slaves/w1_ds2438.c:255:16: warning: 'iad_write' defined but not 
used [-Wunused-function]
 255 | static ssize_t iad_write(struct file *filp, struct kobject *kobj,
 |^
   cc1: some warnings being treated as errors


vim +/BIN_ATTR +391 drivers/w1/slaves/w1_ds2438.c

   390  
 > 391  static BIN_ATTR(iad, 0664, iad_write, 0);
   392  static BIN_ATTR_RO(page0, DS2438_PAGE_SIZE);
   393  static BIN_ATTR_RO(temperature, 0/* real length varies */);
   394  static BIN_ATTR_RO(vad, 0/* real length varies */);
   395  static BIN_ATTR_RO(vdd, 0/* real length varies */);
   396  
   397  static struct bin_attribute *w1_ds2438_bin_attrs[] = {
 > 398  _attr_iad,
   399  _attr_page0,
   400  _attr_temperature,
   401  _attr_vad,
   402  _attr_vdd,
   403  NULL,
   404  };
   405  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH v3] sched/debug: Use sched_debug_lock to serialize use of cgroup_path[] only

2021-04-02 Thread Waiman Long

On 4/2/21 4:40 PM, Steven Rostedt wrote:

On Thu,  1 Apr 2021 14:10:30 -0400
Waiman Long  wrote:


The handling of sysrq key can be activated by echoing the key to
/proc/sysrq-trigger or via the magic key sequence typed into a terminal
that is connected to the system in some way (serial, USB or other mean).
In the former case, the handling is done in a user context. In the
latter case, it is likely to be in an interrupt context.

There should be no more than one instance of sysrq key processing via
a terminal, but multiple instances of /proc/sysrq-trigger is possible.

Currently in print_cpu() of kernel/sched/debug.c, sched_debug_lock is
taken with interrupt disabled for the whole duration of the calls to
print_*_stats() and print_rq() which could last for the quite some time
if the information dump happens on the serial console.

If the system has many cpus and the sched_debug_lock is somehow busy
(e.g. parallel sysrq-t), the system may hit a hard lockup panic
depending on the actually serial console implementation of the
system. For instance,

Wouldn't placing strategically located "touch_nmi_watchdog()"s around fix
this?

-- Steve

The main problem with sched_debug_lock is that under certain 
circumstances, a lock waiter may wait a long time to acquire the lock 
(in seconds). We can't insert touch_nmi_watchdog() while the cpu is 
waiting for the spinlock.


Cheers,
Longman



[syzbot] WARNING: suspicious RCU usage in dput

2021-04-02 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:1e43c377 Merge tag 'xtensa-20210329' of git://github.com/j..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16d76301d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=78ef1d159159890
dashboard link: https://syzkaller.appspot.com/bug?extid=bdef67a6b28a89e6fe71

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+bdef67a6b28a89e6f...@syzkaller.appspotmail.com

=
WARNING: suspicious RCU usage
5.12.0-rc5-syzkaller #0 Not tainted
-
kernel/sched/core.c:8294 Illegal context switch in RCU-bh read-side critical 
section!

other info that might help us debug this:


rcu_scheduler_active = 2, debug_locks = 0
no locks held by systemd-udevd/4825.

stack backtrace:
CPU: 1 PID: 4825 Comm: systemd-udevd Not tainted 5.12.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x141/0x1d7 lib/dump_stack.c:120
 ___might_sleep+0x229/0x2c0 kernel/sched/core.c:8294
 dput+0x4d/0xbc0 fs/dcache.c:870
 step_into+0x2cf/0x1c80 fs/namei.c:1778
 walk_component+0x171/0x6a0 fs/namei.c:1945
 link_path_walk.part.0+0x712/0xc90 fs/namei.c:2266
 link_path_walk fs/namei.c:2190 [inline]
 path_lookupat+0xb7/0x830 fs/namei.c:2419
 filename_lookup+0x19f/0x560 fs/namei.c:2453
 do_readlinkat+0xcd/0x2f0 fs/stat.c:417
 __do_sys_readlinkat fs/stat.c:444 [inline]
 __se_sys_readlinkat fs/stat.c:441 [inline]
 __x64_sys_readlinkat+0x93/0xf0 fs/stat.c:441
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fb5a7e200ba
Code: 48 8b 0d e1 bd 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 
00 00 00 0f 1f 44 00 00 49 89 ca b8 0b 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 
c3 48 8b 0d ae bd 2b 00 f7 d8 64 89 01 48
RSP: 002b:7ffc9c440e38 EFLAGS: 0202 ORIG_RAX: 010b
RAX: ffda RBX: 5604089e4380 RCX: 7fb5a7e200ba
RDX: 5604089e4380 RSI: 7ffc9c440ec0 RDI: ff9c
RBP: 0064 R08: 7fb5a80dcbc8 R09: 0070
R10: 0063 R11: 0202 R12: 7ffc9c440ec0
R13: ff9c R14: 7ffc9c440e90 R15: 0063


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


Re: [PATCH 0/3] Refine mtk-cmdq-mailbox callback mechanism

2021-04-02 Thread Chun-Kuang Hu
Hi, Jassi:

Chun-Kuang Hu  於 2021年3月15日 週一 上午7:33寫道:
>
> mtk-cmdq-mailbox use proprietary callback mechanism and proprietary
> error number, but these could be replaced by standard callback
> mechanism and standard error number. In addition, use cmdq_pkt as
> callback data to prevent redundnat assignment.
>
> Because client driver still use proprietary mechanism, so keep
> proprietary mechanism until client driver use the standard one.

How do you think about this series?

Regards,
Chun-Kuang.

>
> Chun-Kuang Hu (3):
>   mailbox: mtk-cmdq: Remove cmdq_cb_status
>   mailbox: mtk-cmdq: Use mailbox rx_callback
>   mailbox: mtk-cmdq: Add struct cmdq_pkt in struct cmdq_cb_data
>
>  drivers/mailbox/mtk-cmdq-mailbox.c   | 24 +++-
>  include/linux/mailbox/mtk-cmdq-mailbox.h |  8 ++--
>  2 files changed, 17 insertions(+), 15 deletions(-)
>
> --
> 2.17.1
>


Re: [PATCH 2/4] KVM: MIPS: rework flush_shadow_* callbacks into one that prepares the flush

2021-04-02 Thread Huacai Chen
Hi, Paolo,

TE mode has been removed in the MIPS tree, can we also remove it in
KVM tree before this rework?

Huacai

On Fri, Apr 2, 2021 at 11:58 PM Paolo Bonzini  wrote:
>
> Both trap-and-emulate and VZ have a single implementation that covers
> both .flush_shadow_all and .flush_shadow_memslot, and both of them end
> with a call to kvm_flush_remote_tlbs.
>
> Unify the callbacks into one and extract the call to kvm_flush_remote_tlbs.
> The next patches will pull it further out of the the architecture-specific
> MMU notifier functions kvm_unmap_hva_range and kvm_set_spte_hva.
>
> Signed-off-by: Paolo Bonzini 
> ---
>  arch/mips/include/asm/kvm_host.h |  9 +
>  arch/mips/kvm/mips.c | 12 ++--
>  arch/mips/kvm/mmu.c  |  9 ++---
>  arch/mips/kvm/trap_emul.c| 13 ++---
>  arch/mips/kvm/vz.c   | 19 ---
>  5 files changed, 19 insertions(+), 43 deletions(-)
>
> diff --git a/arch/mips/include/asm/kvm_host.h 
> b/arch/mips/include/asm/kvm_host.h
> index feaa77036b67..6c8c0ab53be2 100644
> --- a/arch/mips/include/asm/kvm_host.h
> +++ b/arch/mips/include/asm/kvm_host.h
> @@ -815,14 +815,7 @@ struct kvm_mips_callbacks {
> int (*vcpu_init)(struct kvm_vcpu *vcpu);
> void (*vcpu_uninit)(struct kvm_vcpu *vcpu);
> int (*vcpu_setup)(struct kvm_vcpu *vcpu);
> -   void (*flush_shadow_all)(struct kvm *kvm);
> -   /*
> -* Must take care of flushing any cached GPA PTEs (e.g. guest entries 
> in
> -* VZ root TLB, or T GVA page tables and corresponding root TLB
> -* mappings).
> -*/
> -   void (*flush_shadow_memslot)(struct kvm *kvm,
> -const struct kvm_memory_slot *slot);
> +   void (*prepare_flush_shadow)(struct kvm *kvm);
> gpa_t (*gva_to_gpa)(gva_t gva);
> void (*queue_timer_int)(struct kvm_vcpu *vcpu);
> void (*dequeue_timer_int)(struct kvm_vcpu *vcpu);
> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
> index 7db8234a4407..867b8de0fc07 100644
> --- a/arch/mips/kvm/mips.c
> +++ b/arch/mips/kvm/mips.c
> @@ -206,7 +206,8 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
> kvm_mips_flush_gpa_pt(kvm, 0, ~0);
>
> /* Let implementation do the rest */
> -   kvm_mips_callbacks->flush_shadow_all(kvm);
> +   kvm_mips_callbacks->prepare_flush_shadow(kvm);
> +   kvm_flush_remote_tlbs(kvm);
>  }
>
>  void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
> @@ -221,8 +222,7 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
> /* Flush slot from GPA */
> kvm_mips_flush_gpa_pt(kvm, slot->base_gfn,
>   slot->base_gfn + slot->npages - 1);
> -   /* Let implementation do the rest */
> -   kvm_mips_callbacks->flush_shadow_memslot(kvm, slot);
> +   kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
> spin_unlock(>mmu_lock);
>  }
>
> @@ -262,9 +262,8 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
> /* Write protect GPA page table entries */
> needs_flush = kvm_mips_mkclean_gpa_pt(kvm, new->base_gfn,
> new->base_gfn + new->npages - 1);
> -   /* Let implementation do the rest */
> if (needs_flush)
> -   kvm_mips_callbacks->flush_shadow_memslot(kvm, new);
> +   kvm_arch_flush_remote_tlbs_memslot(kvm, new);
> spin_unlock(>mmu_lock);
> }
>  }
> @@ -1000,7 +999,8 @@ void kvm_arch_flush_remote_tlbs_memslot(struct kvm *kvm,
> const struct kvm_memory_slot *memslot)
>  {
> /* Let implementation handle TLB/GVA invalidation */
> -   kvm_mips_callbacks->flush_shadow_memslot(kvm, memslot);
> +   kvm_mips_callbacks->prepare_flush_shadow(kvm);
> +   kvm_flush_remote_tlbs(kvm);
>  }
>
>  long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long 
> arg)
> diff --git a/arch/mips/kvm/mmu.c b/arch/mips/kvm/mmu.c
> index 3dabeda82458..7e055e5dfd3c 100644
> --- a/arch/mips/kvm/mmu.c
> +++ b/arch/mips/kvm/mmu.c
> @@ -491,7 +491,8 @@ int kvm_unmap_hva_range(struct kvm *kvm, unsigned long 
> start, unsigned long end,
>  {
> handle_hva_to_gpa(kvm, start, end, _unmap_hva_handler, NULL);
>
> -   kvm_mips_callbacks->flush_shadow_all(kvm);
> +   kvm_mips_callbacks->prepare_flush_shadow(kvm);
> +   kvm_flush_remote_tlbs(kvm);
> return 0;
>  }
>
> @@ -532,8 +533,10 @@ int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, 
> pte_t pte)
> int ret;
>
> ret = handle_hva_to_gpa(kvm, hva, end, _set_spte_handler, );
> -   if (ret)
> -   kvm_mips_callbacks->flush_shadow_all(kvm);
> +   if (ret) {
> +   kvm_mips_callbacks->prepare_flush_shadow(kvm);
> +   kvm_flush_remote_tlbs(kvm);
> +   }
> return 0;
>  }
>
> diff --git 

Re: [PATCH 1/4] KVM: constify kvm_arch_flush_remote_tlbs_memslot

2021-04-02 Thread Huacai Chen
Reviewed-by: Huacai Chen 

On Fri, Apr 2, 2021 at 11:58 PM Paolo Bonzini  wrote:
>
> memslots are stored in RCU and there should be no need to
> change them.
>
> Signed-off-by: Paolo Bonzini 
> ---
>  arch/arm64/kvm/arm.c | 2 +-
>  arch/mips/kvm/mips.c | 2 +-
>  arch/x86/kvm/mmu/mmu.c   | 2 +-
>  include/linux/kvm_host.h | 2 +-
>  4 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 7f06ba76698d..9141730b7963 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1268,7 +1268,7 @@ void kvm_arch_sync_dirty_log(struct kvm *kvm, struct 
> kvm_memory_slot *memslot)
>  }
>
>  void kvm_arch_flush_remote_tlbs_memslot(struct kvm *kvm,
> -   struct kvm_memory_slot *memslot)
> +   const struct kvm_memory_slot *memslot)
>  {
> kvm_flush_remote_tlbs(kvm);
>  }
> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
> index 58a8812e2fa5..7db8234a4407 100644
> --- a/arch/mips/kvm/mips.c
> +++ b/arch/mips/kvm/mips.c
> @@ -997,7 +997,7 @@ void kvm_arch_sync_dirty_log(struct kvm *kvm, struct 
> kvm_memory_slot *memslot)
>  }
>
>  void kvm_arch_flush_remote_tlbs_memslot(struct kvm *kvm,
> -   struct kvm_memory_slot *memslot)
> +   const struct kvm_memory_slot *memslot)
>  {
> /* Let implementation handle TLB/GVA invalidation */
> kvm_mips_callbacks->flush_shadow_memslot(kvm, memslot);
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 0d92a269c5fa..f75c677910c9 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5616,7 +5616,7 @@ void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
>  }
>
>  void kvm_arch_flush_remote_tlbs_memslot(struct kvm *kvm,
> -   struct kvm_memory_slot *memslot)
> +   const struct kvm_memory_slot *memslot)
>  {
> /*
>  * All current use cases for flushing the TLBs for a specific memslot
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index e6d77353025c..34a974ffc882 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -894,7 +894,7 @@ void kvm_arch_sync_dirty_log(struct kvm *kvm, struct 
> kvm_memory_slot *memslot);
>
>  #ifdef CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT
>  void kvm_arch_flush_remote_tlbs_memslot(struct kvm *kvm,
> -   struct kvm_memory_slot *memslot);
> +   const struct kvm_memory_slot 
> *memslot);
>  #else /* !CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT */
>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log);
>  int kvm_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log,
> --
> 2.30.1
>
>


[syzbot] WARNING: suspicious RCU usage in find_inlist_lock

2021-04-02 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:1e43c377 Merge tag 'xtensa-20210329' of git://github.com/j..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=114cdd4ad0
kernel config:  https://syzkaller.appspot.com/x/.config?x=78ef1d159159890
dashboard link: https://syzkaller.appspot.com/bug?extid=b221933e5f9ad5b0e2fd

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+b221933e5f9ad5b0e...@syzkaller.appspotmail.com

=
WARNING: suspicious RCU usage
5.12.0-rc5-syzkaller #0 Not tainted
-
kernel/sched/core.c:8294 Illegal context switch in RCU-sched read-side critical 
section!

other info that might help us debug this:


rcu_scheduler_active = 2, debug_locks = 0
no locks held by syz-executor.1/8425.

stack backtrace:
CPU: 1 PID: 8425 Comm: syz-executor.1 Not tainted 5.12.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x141/0x1d7 lib/dump_stack.c:120
 ___might_sleep+0x266/0x2c0 kernel/sched/core.c:8294
 __mutex_lock_common kernel/locking/mutex.c:928 [inline]
 __mutex_lock+0xa9/0x1120 kernel/locking/mutex.c:1096
 find_inlist_lock_noload net/bridge/netfilter/ebtables.c:316 [inline]
 find_inlist_lock.constprop.0+0x26/0x220 net/bridge/netfilter/ebtables.c:330
 find_table_lock net/bridge/netfilter/ebtables.c:339 [inline]
 do_ebt_get_ctl+0x208/0x790 net/bridge/netfilter/ebtables.c:2329
 nf_getsockopt+0x72/0xd0 net/netfilter/nf_sockopt.c:116
 ip_getsockopt net/ipv4/ip_sockglue.c:1777 [inline]
 ip_getsockopt+0x164/0x1c0 net/ipv4/ip_sockglue.c:1756
 tcp_getsockopt+0x86/0xd0 net/ipv4/tcp.c:4239
 __sys_getsockopt+0x21f/0x5f0 net/socket.c:2161
 __do_sys_getsockopt net/socket.c:2176 [inline]
 __se_sys_getsockopt net/socket.c:2173 [inline]
 __x64_sys_getsockopt+0xba/0x150 net/socket.c:2173
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x467a6a
Code: 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 
00 00 00 0f 1f 44 00 00 49 89 ca b8 37 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 
c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:7ffe660d82f8 EFLAGS: 0202 ORIG_RAX: 0037
RAX: ffda RBX: 005401a0 RCX: 00467a6a
RDX: 0081 RSI:  RDI: 0003
RBP:  R08: 7ffe660d831c R09: 7ffe660d83a0
R10: 7ffe660d8320 R11: 0202 R12: 0003
R13: 7ffe660d8320 R14: 00540128 R15: 7ffe660d831c


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


[PATCH] powerpc/dts: fix not include DTC_FLAGS

2021-04-02 Thread Youlin Song
I wanted to build the fsl dts in my machine and found that
the dtb have not extra space,so uboot will cause about
FDT_ERR_NOSPACE issue.

Signed-off-by: Youlin Song 
---
 arch/powerpc/boot/dts/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/boot/dts/Makefile b/arch/powerpc/boot/dts/Makefile
index fb335d05aae8..c21165c0cd76 100644
--- a/arch/powerpc/boot/dts/Makefile
+++ b/arch/powerpc/boot/dts/Makefile
@@ -2,5 +2,6 @@
 
 subdir-y += fsl
 
+DTC_FLAGS   ?= -p 1024
 dtstree:= $(srctree)/$(src)
 dtb-$(CONFIG_OF_ALL_DTBS) := $(patsubst $(dtstree)/%.dts,%.dtb, $(wildcard 
$(dtstree)/*.dts))
-- 
2.25.1



[PATCH net 2/2] net: hns3: Remove un-necessary 'else-if' in the hclge_reset_event()

2021-04-02 Thread Salil Mehta
Code to defer the reset(which caps the frequency of the reset) schedules the
timer and returns. Hence, following 'else-if' looks un-necessary.

Fixes: 9de0b86f6444 ("net: hns3: Prevent to request reset frequently")
Signed-off-by: Salil Mehta 
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c 
b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 7ad0722383f5..2ed464e13d1b 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3981,7 +3981,9 @@ static void hclge_reset_event(struct pci_dev *pdev, 
struct hnae3_handle *handle)
  HCLGE_RESET_INTERVAL))) {
mod_timer(>reset_timer, jiffies + HCLGE_RESET_INTERVAL);
return;
-   } else if (hdev->default_reset_request) {
+   }
+
+   if (hdev->default_reset_request) {
hdev->reset_level =
hclge_get_reset_level(ae_dev,
  >default_reset_request);
-- 
2.17.1



[PATCH net 1/2] net: hns3: Remove the left over redundant check & assignment

2021-04-02 Thread Salil Mehta
This removes the left over check and assignment which is no longer used
anywhere in the function and should have been removed as part of the
below mentioned patch.

Fixes: 012fcb52f67c ("net: hns3: activate reset timer when calling reset_event")
Signed-off-by: Salil Mehta 
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c 
b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index e3f81c7e0ce7..7ad0722383f5 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3976,8 +3976,6 @@ static void hclge_reset_event(struct pci_dev *pdev, 
struct hnae3_handle *handle)
 * want to make sure we throttle the reset request. Therefore, we will
 * not allow it again before 3*HZ times.
 */
-   if (!handle)
-   handle = >vport[0].nic;
 
if (time_before(jiffies, (hdev->last_reset_time +
  HCLGE_RESET_INTERVAL))) {
-- 
2.17.1



[PATCH net 0/2] Misc. fixes for hns3 driver

2021-04-02 Thread Salil Mehta
Fixes for the miscellaneous problems found during the review of the code.

Salil Mehta (2):
  net: hns3: Remove the left over redundant check & assignment
  net: hns3: Remove un-necessary 'else-if' in the hclge_reset_event()

 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

-- 
2.17.1



RE: [EXTERNAL] Re: [PATCH] ASoC: max98390: Add controls for tx path

2021-04-02 Thread Steve Lee
> -Original Message-
> From: Mark Brown 
> Sent: Saturday, April 3, 2021 12:23 AM
> To: Steve Lee 
> Cc: lgirdw...@gmail.com; pe...@perex.cz; ti...@suse.com;
> ckee...@opensource.cirrus.com; ge...@linux-m68k.org;
> r...@opensource.wolfsonmicro.com; shumi...@realtek.com;
> srinivas.kandaga...@linaro.org; k...@kernel.org; dmur...@ti.com;
> jack...@realtek.com; nuno...@analog.com; linux-kernel@vger.kernel.org;
> alsa-de...@alsa-project.org; ryan.lee.ma...@gmail.com;
> steves.lee.ma...@gmail.com
> Subject: [EXTERNAL] Re: [PATCH] ASoC: max98390: Add controls for tx path
> 
> On Fri, Apr 02, 2021 at 12:36:43PM +0900, Steve Lee wrote:
> 
> > +   SOC_SINGLE("Tx Enable Selection", MAX98390_PCM_TX_EN_A,
> > +   0, 255, 0),
> 
> I'm not clear what this is (especially given the source selection below) but 
> it
> looks like it should be a mute control?
Yes, each channel of enable and disable control. 
I will update this also configured by TDM slot configuration.  
> 
> > +   SOC_SINGLE("Tx Hiz Selection", MAX98390_PCM_TX_HIZ_CTRL_A,
> > +   0, 255, 0),
> 
> This I'd expect to be tied into machine driver configuration, either DT 
> properties
> or TDM slot configuration - it's not something that looks like it's something
> you'd want to control at runtime.
I will update this with either TDM slot configuration and DT properties.
> 
> > +   SOC_SINGLE("Tx Source Selection", MAX98390_PCM_CH_SRC_2,
> > +   0, 255, 0),
> 
> This looks like it should be a DAPM control or possibly a TDM slot 
> configuration -
> look at how the Arizona devices handle routing from multiple TDM slots for the
> DAPM version.
This is for Current Sensing and Voltage sensing slot selection. I will update 
this as DT properties.


Re: [syzbot] KASAN: use-after-free Read in fw_load_sysfs_fallback

2021-04-02 Thread Luis Chamberlain
On Fri, Apr 02, 2021 at 02:41:20PM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:5ee96fa9 Merge tag 'irq-urgent-2021-03-21' of git://git.ke..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1028d621d0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c51293a9ca630f6d
> dashboard link: https://syzkaller.appspot.com/bug?extid=9b91d635e2b51efd6371
> compiler:   Debian clang version 11.0.1-2
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1794dad6d0
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11f7afe6d0
> 
> The issue was bisected to:
> 
> commit bcfbd3523f3c6eea51a74d217a8ebc5463bcb7f4
> Author: Junyong Sun 
> Date:   Tue Mar 3 02:36:08 2020 +
> 
> firmware: fix a double abort case with fw_load_sysfs_fallback
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17e4778ad0
> final oops: https://syzkaller.appspot.com/x/report.txt?x=1414778ad0
> console output: https://syzkaller.appspot.com/x/log.txt?x=1014778ad0
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+9b91d635e2b51efd6...@syzkaller.appspotmail.com
> Fixes: bcfbd3523f3c ("firmware: fix a double abort case with 
> fw_load_sysfs_fallback")
> 
> platform regulatory.0: Direct firmware load for regulatory.db failed with 
> error -2
> platform regulatory.0: Falling back to sysfs fallback for: regulatory.db
> ==
> BUG: KASAN: use-after-free in __list_add_valid+0x36/0xc0 lib/list_debug.c:23
> Read of size 8 at addr 888014188ac8 by task syz-executor310/9819
> 
> CPU: 0 PID: 9819 Comm: syz-executor310 Not tainted 5.12.0-rc3-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:79 [inline]
>  dump_stack+0x176/0x24e lib/dump_stack.c:120
>  print_address_description+0x5f/0x3a0 mm/kasan/report.c:232
>  __kasan_report mm/kasan/report.c:399 [inline]
>  kasan_report+0x15c/0x200 mm/kasan/report.c:416
>  __list_add_valid+0x36/0xc0 lib/list_debug.c:23
>  __list_add include/linux/list.h:67 [inline]
>  list_add include/linux/list.h:86 [inline]
>  fw_load_sysfs_fallback+0x110/0x720 
> drivers/base/firmware_loader/fallback.c:516
>  fw_load_from_user_helper+0x242/0x320 
> drivers/base/firmware_loader/fallback.c:581
>  _request_firmware+0x2c5/0x4c0 drivers/base/firmware_loader/main.c:831
>  request_firmware+0x35/0x50 drivers/base/firmware_loader/main.c:875
>  reg_reload_regdb+0x3a/0x1b0 net/wireless/reg.c:1095
>  genl_family_rcv_msg_doit net/netlink/genetlink.c:739 [inline]
>  genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
>  genl_rcv_msg+0xe4e/0x1280 net/netlink/genetlink.c:800
>  netlink_rcv_skb+0x190/0x3a0 net/netlink/af_netlink.c:2502
>  genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
>  netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
>  netlink_unicast+0x786/0x940 net/netlink/af_netlink.c:1338
>  netlink_sendmsg+0x9ae/0xd50 net/netlink/af_netlink.c:1927
>  sock_sendmsg_nosec net/socket.c:654 [inline]
>  sock_sendmsg net/socket.c:674 [inline]
>  sys_sendmsg+0x519/0x800 net/socket.c:2350
>  ___sys_sendmsg net/socket.c:2404 [inline]
>  __sys_sendmsg+0x2bf/0x370 net/socket.c:2433
>  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x44ab59
> Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 01 16 00 00 90 48 89 f8 48 89 f7 48 
> 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 
> 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:7fa2b8ba9318 EFLAGS: 0246 ORIG_RAX: 002e
> RAX: ffda RBX: 004d62a8 RCX: 0044ab59
> RDX:  RSI: 21c0 RDI: 0003
> RBP: 004d62a0 R08: 0099 R09: 
> R10: 000c R11: 0246 R12: 0031313230386c6e
> R13: 7ffeeaba390f R14: 7fa2b8ba9400 R15: 00022000
> 
> Allocated by task 9818:
>  kasan_save_stack mm/kasan/common.c:38 [inline]
>  kasan_set_track mm/kasan/common.c:46 [inline]
>  set_alloc_info mm/kasan/common.c:427 [inline]
>  kasan_kmalloc+0xc2/0xf0 mm/kasan/common.c:506
>  kasan_kmalloc include/linux/kasan.h:233 [inline]
>  kmem_cache_alloc_trace+0x21b/0x350 mm/slub.c:2934
>  kmalloc include/linux/slab.h:554 [inline]
>  kzalloc include/linux/slab.h:684 [inline]
>  __allocate_fw_priv+0x98/0x340 drivers/base/firmware_loader/main.c:186
>  alloc_lookup_fw_priv+0x1c3/0x380 drivers/base/firmware_loader/main.c:250
>  _request_firmware_prepare+0x23c/0x5b0 drivers/base/firmware_loader/main.c:744
>  _request_firmware+0xd9/0x4c0 drivers/base/firmware_loader/main.c:806
>  request_firmware+0x35/0x50 drivers/base/firmware_loader/main.c:875
>  reg_reload_regdb+0x3a/0x1b0 net/wireless/reg.c:1095
>  

Re: [PATCH 9/9] sched: prctl() and cgroup interaction

2021-04-02 Thread Josh Don
Hi Peter,

I walked through the reference counting, and it seems good to me
(though it did take a few passes to fully digest the invariants for
the fat cookie stuff).

> +unsigned long sched_core_alloc_cookie(unsigned int type)
>  {
> struct sched_core_cookie *ck = kmalloc(sizeof(*ck), GFP_KERNEL);
> if (!ck)
> return 0;
> -   refcount_set(>refcnt, 1);
> +   WARN_ON_ONCE(type > GROUP_COOKIE);
> +   sched_core_init_cookie(ck, type);
> sched_core_get();
>
> -   return (unsigned long)ck;
> +   return (unsigned long)ck | type;
 > }

This feels like it needs to be stronger than a WARN_ON_ONCE; could
create a corrupted address that we later try to kfree().

Also, for my own edification, why will the bottom two bits here always be 0?

> -unsigned long sched_core_alloc_cookie(void)
> +static inline void *cookie_ptr(unsigned long cookie)
> +{
> +   return (void *)(cookie & ~3UL);
> +}
> +
> +static inline int cookie_type(unsigned long cookie)
> +{
> +   return cookie & 3;
> +}

s/3/FAT_COOKIE

> +#define FAT_COOKIE 0x03

Move to sched.h to group with TASK/GROUP_COOKIE?

> +static unsigned long __sched_core_fat_cookie(struct task_struct *p,
> +void **spare_fat,
> +unsigned long cookie)
> +{

This function looks good to me, but could use some more comments about
the pre/post-condition assumptions. Ie. cookie already has a get()
associated with it, caller is expected to kfree the spare_fat.

> +   raw_spin_lock_irqsave(_lock, flags);
> +   n = rb_find_add(>node, _root, fat_cmp);
> +   raw_spin_unlock_irqrestore(_lock, flags);
> +
> +   if (n) {
> +   sched_core_put_fat(fat);
> +   fat = node_2_fat(n);

This put() doesn't seem strictly necessary; caller will be
unconditionally freeing the spare_fat. Keep anyway for completeness,
but add a comment?


Re: CFI violation in drivers/infiniband/core/sysfs.c

2021-04-02 Thread Kees Cook
On Fri, Apr 02, 2021 at 08:30:18PM -0300, Jason Gunthorpe wrote:
> On Fri, Apr 02, 2021 at 04:03:30PM -0700, Kees Cook wrote:
> 
> > > relevant. It seems to me that the hw_counters 'struct attribute_group'
> > > should probably be its own kobj within both of these structures so they
> > > can have their own sysfs ops (unless there is some other way to do this
> > > that I am missing).
> 
> Err, yes, every subclass of the attribute should be accompanied by a
> distinct kobject type to relay the show methods with typesafety, this
> is how this design pattern is intended to be used.
> 
> If I understand your report properly the hw_stats_attribute is being
> assigned to a 'port_type' kobject and it only works by pure luck because
> the show/store happens to overlap between port and hsa attributes?

"happens to" :) Yeah, they're all like this, unfortunately:
https://lore.kernel.org/lkml/202006112217.2E6CE093@keescook/

This is the first that I've seen that crossed kobj_types in the same
group, though. :)

> > > I would appreciate someone else taking a look and seeing if I am off
> > > base or if there is an easier way to solve this.
> > 
> > So, it seems that the reason for a custom kobj_type here is to use the
> > .release callback. 
> 
> Every kobject should be associated with a specific, fixed, attribute
> type. The purpose of the wrappers is to inject type safety so the
> attribute implementations know they are working on the right stuff.

Right -- though it's not specifically required to be a fixed attribute.
It can just be a "generic" kobject. This seems to happen a lot when
something wants to show up a global or const value in /sys

> The answer is that the setup_hw_stats_() for port and device must
> be split up and the attribute implementations for each of them have to
> subclass starting from the correct type.

I'm not convinced that just backing everything off to kobject isn't
simpler?

> And then two show/set functions that bounce through the correct types
> to the data.

I'd like to make these things compile-time safe (there is not type
associated with use the __ATTR() macro, for example). That I haven't
really figured out how to do right.

-- 
Kees Cook


[PATCH v2 9/9] w1: ds2438: support for writing to offset register

2021-04-02 Thread Luiz Sampaio
Added a sysfs entry to support writing to the offset register on page1.
This register is used to calibrate the chip canceling offset errors in the
current ADC. This means that, over time, reading the IAD register will not
return the correct current measurement, it will have an offset. Writing to
the offset register if the two's complement of the current register while
passing zero current to the load will calibrate the measurements. This
change was tested on real hardware and it was able to calibrate the chip
correctly.

Signed-off-by: Luiz Sampaio 
---
 Documentation/w1/slaves/w1_ds2438.rst | 11 +-
 drivers/w1/slaves/w1_ds2438.c | 49 +++
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/Documentation/w1/slaves/w1_ds2438.rst 
b/Documentation/w1/slaves/w1_ds2438.rst
index ac8d0d4b0d0e..5c5573991351 100644
--- a/Documentation/w1/slaves/w1_ds2438.rst
+++ b/Documentation/w1/slaves/w1_ds2438.rst
@@ -22,7 +22,7 @@ is also often used in weather stations and applications such 
as: rain gauge,
 wind speed/direction measuring, humidity sensing, etc.
 
 Current support is provided through the following sysfs files (all files
-except "iad" are readonly):
+except "iad" and "offset" are readonly):
 
 "iad"
 -
@@ -52,6 +52,15 @@ Internally when this file is read, the additional CRC byte 
is also obtained
 from the slave device. If it is correct, the 8 bytes page data are passed
 to userspace, otherwise an I/O error is returned.
 
+"offset"
+---
+This file controls the 2-byte Offset Register of the chip.
+Writing a 2-byte value will change the Offset Register, which changes the
+current measurement done by the chip. Changing this register to the two's 
complement
+of the current register while forcing zero current through the load will 
calibrate
+the chip, canceling offset errors in the current ADC.
+
+
 "temperature"
 -
 Opening and reading this file initiates the CONVERT_T (temperature conversion)
diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index 2cfdfedb584f..223a9aae6e2d 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -193,6 +193,34 @@ static int w1_ds2438_change_config_bit(struct w1_slave 
*sl, u8 mask, u8 value)
return -1;
 }
 
+static int w1_ds2438_change_offset_register(struct w1_slave *sl, u8 *value)
+{
+   unsigned int retries = W1_DS2438_RETRIES;
+   u8 w1_buf[9];
+   u8 w1_page1_buf[DS2438_PAGE_SIZE + 1 /*for CRC*/];
+
+   if (w1_ds2438_get_page(sl, 1, w1_page1_buf) == 0) {
+   memcpy(_buf[2], w1_page1_buf, DS2438_PAGE_SIZE - 1); /* 
register 7 is reserved */
+   w1_buf[7] = value[0]; /* change only offset register */
+   w1_buf[8] = value[1];
+   while (retries--) {
+   if (w1_reset_select_slave(sl))
+   continue;
+   w1_buf[0] = W1_DS2438_WRITE_SCRATCH;
+   w1_buf[1] = 0x01; /* write to page 1 */
+   w1_write_block(sl->master, w1_buf, 9);
+
+   if (w1_reset_select_slave(sl))
+   continue;
+   w1_buf[0] = W1_DS2438_COPY_SCRATCH;
+   w1_buf[1] = 0x01;
+   w1_write_block(sl->master, w1_buf, 2);
+   return 0;
+   }
+   }
+   return -1;
+}
+
 static int w1_ds2438_get_voltage(struct w1_slave *sl,
 int adc_input, uint16_t *voltage)
 {
@@ -364,6 +392,25 @@ static ssize_t page1_read(struct file *filp, struct 
kobject *kobj,
return ret;
 }
 
+static ssize_t offset_write(struct file *filp, struct kobject *kobj,
+   struct bin_attribute *bin_attr, char *buf,
+   loff_t off, size_t count)
+{
+   struct w1_slave *sl = kobj_to_w1_slave(kobj);
+   int ret;
+
+   mutex_lock(>master->bus_mutex);
+
+   if (w1_ds2438_change_offset_register(sl, buf) == 0)
+   ret = count;
+   else
+   ret = -EIO;
+
+   mutex_unlock(>master->bus_mutex);
+
+   return ret;
+}
+
 static ssize_t temperature_read(struct file *filp, struct kobject *kobj,
struct bin_attribute *bin_attr, char *buf,
loff_t off, size_t count)
@@ -430,6 +477,7 @@ static ssize_t vdd_read(struct file *filp, struct kobject 
*kobj,
 static BIN_ATTR(iad, 0664, iad_write, 0);
 static BIN_ATTR_RO(page0, DS2438_PAGE_SIZE);
 static BIN_ATTR_RO(page1, DS2438_PAGE_SIZE);
+static BIN_ATTR_WO(offset, 2);
 static BIN_ATTR_RO(temperature, 0/* real length varies */);
 static BIN_ATTR_RO(vad, 0/* real length varies */);
 static BIN_ATTR_RO(vdd, 0/* real length varies */);
@@ -438,6 +486,7 @@ static struct bin_attribute *w1_ds2438_bin_attrs[] = {
_attr_iad,
_attr_page0,
_attr_page1,
+   _attr_offset,

[PATCH v2 8/9] w1: ds2438: adding support for reading page1

2021-04-02 Thread Luiz Sampaio
Added a sysfs entry to support reading the page1 registers. This registers
contain Elapsed Time Meter (ETM) data, which shows for how long the chip is
on, as well as an Offset Register data, which can be used to calibrate the
current measurement of the chip.

Signed-off-by: Luiz Sampaio 
---
 Documentation/w1/slaves/w1_ds2438.rst |  8 ++
 drivers/w1/slaves/w1_ds2438.c | 41 +++
 2 files changed, 49 insertions(+)

diff --git a/Documentation/w1/slaves/w1_ds2438.rst 
b/Documentation/w1/slaves/w1_ds2438.rst
index a29309a3f8e5..ac8d0d4b0d0e 100644
--- a/Documentation/w1/slaves/w1_ds2438.rst
+++ b/Documentation/w1/slaves/w1_ds2438.rst
@@ -44,6 +44,14 @@ Internally when this file is read, the additional CRC byte 
is also obtained
 from the slave device. If it is correct, the 8 bytes page data are passed
 to userspace, otherwise an I/O error is returned.
 
+"page1"
+---
+This file provides full 8 bytes of the chip Page 1 (01h).
+This page contains the ICA, elapsed time meter and current offset data of the 
DS2438.
+Internally when this file is read, the additional CRC byte is also obtained
+from the slave device. If it is correct, the 8 bytes page data are passed
+to userspace, otherwise an I/O error is returned.
+
 "temperature"
 -
 Opening and reading this file initiates the CONVERT_T (temperature conversion)
diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index ef6217ecb1cb..2cfdfedb584f 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -49,6 +49,15 @@
 #define DS2438_CURRENT_MSB 0x06
 #define DS2438_THRESHOLD   0x07
 
+/* Page #1 definitions */
+#define DS2438_ETM_0   0x00
+#define DS2438_ETM_1   0x01
+#define DS2438_ETM_2   0x02
+#define DS2438_ETM_3   0x03
+#define DS2438_ICA 0x04
+#define DS2438_OFFSET_LSB  0x05
+#define DS2438_OFFSET_MSB  0x06
+
 static int w1_ds2438_get_page(struct w1_slave *sl, int pageno, u8 *buf)
 {
unsigned int retries = W1_DS2438_RETRIES;
@@ -325,6 +334,36 @@ static ssize_t page0_read(struct file *filp, struct 
kobject *kobj,
return ret;
 }
 
+static ssize_t page1_read(struct file *filp, struct kobject *kobj,
+ struct bin_attribute *bin_attr, char *buf,
+ loff_t off, size_t count)
+{
+   struct w1_slave *sl = kobj_to_w1_slave(kobj);
+   int ret;
+   u8 w1_buf[DS2438_PAGE_SIZE + 1 /*for CRC*/];
+
+   if (off != 0)
+   return 0;
+   if (!buf)
+   return -EINVAL;
+
+   mutex_lock(>master->bus_mutex);
+
+   /* Read no more than page1 size */
+   if (count > DS2438_PAGE_SIZE)
+   count = DS2438_PAGE_SIZE;
+
+   if (w1_ds2438_get_page(sl, 1, w1_buf) == 0) {
+   memcpy(buf, _buf, count);
+   ret = count;
+   } else
+   ret = -EIO;
+
+   mutex_unlock(>master->bus_mutex);
+
+   return ret;
+}
+
 static ssize_t temperature_read(struct file *filp, struct kobject *kobj,
struct bin_attribute *bin_attr, char *buf,
loff_t off, size_t count)
@@ -390,6 +429,7 @@ static ssize_t vdd_read(struct file *filp, struct kobject 
*kobj,
 
 static BIN_ATTR(iad, 0664, iad_write, 0);
 static BIN_ATTR_RO(page0, DS2438_PAGE_SIZE);
+static BIN_ATTR_RO(page1, DS2438_PAGE_SIZE);
 static BIN_ATTR_RO(temperature, 0/* real length varies */);
 static BIN_ATTR_RO(vad, 0/* real length varies */);
 static BIN_ATTR_RO(vdd, 0/* real length varies */);
@@ -397,6 +437,7 @@ static BIN_ATTR_RO(vdd, 0/* real length varies */);
 static struct bin_attribute *w1_ds2438_bin_attrs[] = {
_attr_iad,
_attr_page0,
+   _attr_page1,
_attr_temperature,
_attr_vad,
_attr_vdd,
-- 
2.30.1



[PATCH v2 7/9] w1: ds2438: fixing bug that would always get page0

2021-04-02 Thread Luiz Sampaio
The purpose of the w1_ds2438_get_page function is to get the register
values at the page passed as the pageno parameter. However, the page0 was
hardcoded, such that the function always returned the page0 contents. Fixed
so that the function can retrieve any page.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index ccb06b8c2d78..ef6217ecb1cb 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -62,13 +62,13 @@ static int w1_ds2438_get_page(struct w1_slave *sl, int 
pageno, u8 *buf)
if (w1_reset_select_slave(sl))
continue;
w1_buf[0] = W1_DS2438_RECALL_MEMORY;
-   w1_buf[1] = 0x00;
+   w1_buf[1] = (u8)pageno;
w1_write_block(sl->master, w1_buf, 2);
 
if (w1_reset_select_slave(sl))
continue;
w1_buf[0] = W1_DS2438_READ_SCRATCH;
-   w1_buf[1] = 0x00;
+   w1_buf[1] = (u8)pageno;
w1_write_block(sl->master, w1_buf, 2);
 
count = w1_read_block(sl->master, buf, DS2438_PAGE_SIZE + 1);
-- 
2.30.1



[PATCH v2 6/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread Luiz Sampaio
Changed the permissions to preferred octal style.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index 56e53a748059..ccb06b8c2d78 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -388,7 +388,7 @@ static ssize_t vdd_read(struct file *filp, struct kobject 
*kobj,
return ret;
 }
 
-static BIN_ATTR(iad, S_IRUGO | S_IWUSR | S_IWGRP, iad_read, iad_write, 0);
+static BIN_ATTR(iad, 0664, iad_write, 0);
 static BIN_ATTR_RO(page0, DS2438_PAGE_SIZE);
 static BIN_ATTR_RO(temperature, 0/* real length varies */);
 static BIN_ATTR_RO(vad, 0/* real length varies */);
-- 
2.30.1



[PATCH v2 2/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread Luiz Sampaio
Since there is only one statement inside the if clause, no brackets
are required.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index 148921fb9702..a5bb53042c93 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -287,9 +287,9 @@ static ssize_t iad_read(struct file *filp, struct kobject 
*kobj,
if (!buf)
return -EINVAL;
 
-   if (w1_ds2438_get_current(sl, ) == 0) {
+   if (w1_ds2438_get_current(sl, ) == 0)
ret = snprintf(buf, count, "%i\n", voltage);
-   } else
+   else
ret = -EIO;
 
return ret;
-- 
2.30.1



[PATCH v2 5/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread Luiz Sampaio
Since there is only one statement inside the if clause, no brackets
are required.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index eeb593294fbd..56e53a748059 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -380,9 +380,9 @@ static ssize_t vdd_read(struct file *filp, struct kobject 
*kobj,
if (!buf)
return -EINVAL;
 
-   if (w1_ds2438_get_voltage(sl, DS2438_ADC_INPUT_VDD, ) == 0) {
+   if (w1_ds2438_get_voltage(sl, DS2438_ADC_INPUT_VDD, ) == 0)
ret = snprintf(buf, count, "%u\n", voltage);
-   } else
+   else
ret = -EIO;
 
return ret;
-- 
2.30.1



[PATCH v2 4/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread Luiz Sampaio
Since there is only one statement inside the if clause, no brackets
are required.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index eca50cec304f..eeb593294fbd 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -359,9 +359,9 @@ static ssize_t vad_read(struct file *filp, struct kobject 
*kobj,
if (!buf)
return -EINVAL;
 
-   if (w1_ds2438_get_voltage(sl, DS2438_ADC_INPUT_VAD, ) == 0) {
+   if (w1_ds2438_get_voltage(sl, DS2438_ADC_INPUT_VAD, ) == 0)
ret = snprintf(buf, count, "%u\n", voltage);
-   } else
+   else
ret = -EIO;
 
return ret;
-- 
2.30.1



[PATCH v2 3/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread Luiz Sampaio
Since there is only one statement inside the if clause, no brackets
are required.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index a5bb53042c93..eca50cec304f 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -338,9 +338,9 @@ static ssize_t temperature_read(struct file *filp, struct 
kobject *kobj,
if (!buf)
return -EINVAL;
 
-   if (w1_ds2438_get_temperature(sl, ) == 0) {
+   if (w1_ds2438_get_temperature(sl, ) == 0)
ret = snprintf(buf, count, "%i\n", temp);
-   } else
+   else
ret = -EIO;
 
return ret;
-- 
2.30.1



[PATCH v2 1/9] w1: ds2438: fixed a coding style issue

2021-04-02 Thread Luiz Sampaio
There is an if statement and, if the function goes into it, it returns.
So, the next else is not required.

Signed-off-by: Luiz Sampaio 
---
 drivers/w1/slaves/w1_ds2438.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/w1/slaves/w1_ds2438.c b/drivers/w1/slaves/w1_ds2438.c
index 5cfb0ae23e91..148921fb9702 100644
--- a/drivers/w1/slaves/w1_ds2438.c
+++ b/drivers/w1/slaves/w1_ds2438.c
@@ -154,11 +154,11 @@ static int w1_ds2438_change_config_bit(struct w1_slave 
*sl, u8 mask, u8 value)
 
if ((status & mask) == value)
return 0;   /* already set as requested */
-   else {
-   /* changing bit */
-   status ^= mask;
-   perform_write = 1;
-   }
+
+   /* changing bit */
+   status ^= mask;
+   perform_write = 1;
+
break;
}
 
-- 
2.30.1



[PATCH v2 0/9] w1: ds2438: adding support for calibration of current measurements

2021-04-02 Thread Luiz Sampaio
The following patches aim to make a user able to calibrate the current 
measurement of the DS2438. This chip uses a offset register in page1, which is 
added to the current register to give the user the current measurement. If this 
value is wrong, the user will get an offset current value, even if the current 
is zero, for instance. This patch gives support for reading the page1 registers 
(including the offset register) and for writing to the offset register. The 
DS2438 datasheet shows a calibration routine, and with this patch, the user can 
do this quickly by writing the correct value to the offset register. This patch 
was tested on real hardware using a power supply and an electronic load.
Please help to review this series of patches.

Best regards!
Sampaio
---
Changes in v2:
- Using git send-email to send the patches
- Adding documentation as requested
- Separating the coding style changes in different patches as requested

Luiz Sampaio (9):
  w1: ds2438: fixed a coding style issue
  w1: ds2438: fixed a coding style issue
  w1: ds2438: fixed a coding style issue
  w1: ds2438: fixed a coding style issue
  w1: ds2438: fixed a coding style issue
  w1: ds2438: fixed a coding style issue
  w1: ds2438: fixing bug that would always get page0
  w1: ds2438: adding support for reading page1
  w1: ds2438: support for writing to offset register

 Documentation/w1/slaves/w1_ds2438.rst |  19 +++-
 drivers/w1/slaves/w1_ds2438.c | 122 ++
 2 files changed, 124 insertions(+), 17 deletions(-)

-- 
2.30.1



Re: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg

2021-04-02 Thread Roman Gushchin
On Fri, Apr 02, 2021 at 06:04:54PM -0700, Andrew Morton wrote:
> On Wed, 31 Mar 2021 20:35:02 -0700 Roman Gushchin  wrote:
> 
> > On Thu, Apr 01, 2021 at 11:31:16AM +0800, Miaohe Lin wrote:
> > > On 2021/4/1 11:01, Muchun Song wrote:
> > > > Christian Borntraeger reported a warning about "percpu ref
> > > > (obj_cgroup_release) <= 0 (-1) after switching to atomic".
> > > > Because we forgot to obtain the reference to the objcg and
> > > > wrongly obtain the reference of memcg.
> > > > 
> > > > Reported-by: Christian Borntraeger 
> > > > Signed-off-by: Muchun Song 
> > > 
> > > Thanks for the patch.
> > > Is a Fixes tag needed?
> > 
> > No, as the original patch hasn't been merged into the Linus's tree yet.
> > So the fix can be simply squashed.
> 
> Help.  Which is "the original patch"?

"mm: memcontrol: use obj_cgroup APIs to charge kmem pages"


Re: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg

2021-04-02 Thread Shakeel Butt
On Fri, Apr 2, 2021 at 6:04 PM Andrew Morton  wrote:
>
> On Wed, 31 Mar 2021 20:35:02 -0700 Roman Gushchin  wrote:
>
> > On Thu, Apr 01, 2021 at 11:31:16AM +0800, Miaohe Lin wrote:
> > > On 2021/4/1 11:01, Muchun Song wrote:
> > > > Christian Borntraeger reported a warning about "percpu ref
> > > > (obj_cgroup_release) <= 0 (-1) after switching to atomic".
> > > > Because we forgot to obtain the reference to the objcg and
> > > > wrongly obtain the reference of memcg.
> > > >
> > > > Reported-by: Christian Borntraeger 
> > > > Signed-off-by: Muchun Song 
> > >
> > > Thanks for the patch.
> > > Is a Fixes tag needed?
> >
> > No, as the original patch hasn't been merged into the Linus's tree yet.
> > So the fix can be simply squashed.
>
> Help.  Which is "the original patch"?

"mm: memcontrol: use obj_cgroup APIs to charge kmem pages"


Re: [PATCH] mm: memcontrol: fix forget to obtain the ref to objcg in split_page_memcg

2021-04-02 Thread Andrew Morton
On Wed, 31 Mar 2021 20:35:02 -0700 Roman Gushchin  wrote:

> On Thu, Apr 01, 2021 at 11:31:16AM +0800, Miaohe Lin wrote:
> > On 2021/4/1 11:01, Muchun Song wrote:
> > > Christian Borntraeger reported a warning about "percpu ref
> > > (obj_cgroup_release) <= 0 (-1) after switching to atomic".
> > > Because we forgot to obtain the reference to the objcg and
> > > wrongly obtain the reference of memcg.
> > > 
> > > Reported-by: Christian Borntraeger 
> > > Signed-off-by: Muchun Song 
> > 
> > Thanks for the patch.
> > Is a Fixes tag needed?
> 
> No, as the original patch hasn't been merged into the Linus's tree yet.
> So the fix can be simply squashed.

Help.  Which is "the original patch"?

> Btw, the fix looks good to me.
> 
> Acked-by: Roman Gushchin 



Re: [PATCH v5 2/3] x86/bus_lock: Handle #DB for bus lock

2021-04-02 Thread Fenghua Yu
Hi, Thomas,

On Sat, Mar 20, 2021 at 01:42:52PM +0100, Thomas Gleixner wrote:
> On Fri, Mar 19 2021 at 22:19, Fenghua Yu wrote:
> > On Fri, Mar 19, 2021 at 10:30:50PM +0100, Thomas Gleixner wrote:
> >> > +if (sscanf(arg, "ratelimit:%d", ) == 1 && ratelimit > 
> >> > 0) {
> >> > +bld_ratelimit = ratelimit;
> >> 
> >> So any rate up to INTMAX/s is valid here, right?
> >
> > Yes. I don't see smaller limitation than INTMX/s. Is that right?
> 
> That's a given, but what's the point of limits in that range?
> 
> A buslock access locks up the system for X cycles. So the total amount
> of allowable damage in cycles per second is:
> 
>limit * stall_cycles_per_bus_lock
> 
> ergo the time (in seconds) which the system is locked up is:
> 
>limit * stall_cycles_per_bus_lock / cpufreq
> 
> Which means for ~INTMAX/2 on a 2 GHz CPU:
> 
>2 * 10^9 * $CYCLES  / 2 * 10^9  = $CYCLES seconds
> 
> Assumed the inflicted damage is only 1 cycle then #LOCK is pretty much
> permanently on if there are enough threads. Sure #DB will slow them
> down, but it still does not make any sense at all especially as the
> damage is certainly greater than a single cycle.
> 
> And because the changelogs and the docs are void of numbers I just got
> real numbers myself.
> 
> With a single thread doing a 'lock inc *mem' accross a cache line
> boundary the workload which I measured with perf stat goes from:
> 
>  5,940,985,091  instructions  #0.88  insn per cycle   
>   
>2.780950806 seconds time elapsed
>0.99848 seconds user
>4.202137000 seconds sys
> to
> 
>  7,467,979,504  instructions  #0.10  insn per cycle   
>   
>5.110795917 seconds time elapsed
>7.123499000 seconds user
>   37.266852000 seconds sys
> 
> The buslock injection rate is ~250k per second.
> 
> Even if I ratelimit the locked inc by a delay loop of ~5000 cycles
> which is probably more than what the #DB will cost then this single task
> still impacts the workload significantly:
> 
>  6,496,994,537  instructions  #0.39  insn per cycle   
>   
>3.043275473 seconds time elapsed
>1.899852000 seconds user
>8.957088000 seconds sys
> 
> The buslock injection rate is down to ~150k per second in this case.
> 
> And even with throttling the injection rate further down to 25k per
> second the impact on the workload is still significant in the 10% range.

Thank you for your insight!

So I can change the ratelimit to system wide and call usleep_range()
to sleep: 
   while (!__ratelimit(_bld_ratelimit))
   usleep_range(100 / bld_ratelimit,
100 / bld_ratelimit);

The max bld_ratelimit is 1000,000/s because the max sleeping time is 1 usec.
The min bld_ratelimit is 1/s.

> 
> And of course the documentation of the ratelimit parameter explains all
> of this in great detail so the administrator has a trivial job to tune
> that, right?

I will explain how to tune the parameter in buslock.rst doc.

> 
> >> > +case sld_ratelimit:
> >> > +/* Enforce no more than bld_ratelimit bus locks/sec. */
> >> > +while (!__ratelimit(_current_user()->bld_ratelimit))
> >> > +msleep(1000 / bld_ratelimit);
> 
> For any ratelimit > 1000 this will loop up to 1000 times with
> CONFIG_HZ=1000.
> 
> Assume that the buslock producer has tons of threads which all end up
> here pretty soon then you launch a mass wakeup in the worst case every
> jiffy. Are you sure that the cure is better than the disease?

if using usleep_range() to sleep, the threads will not sleep and wakeup,
right? Maybe I can use msleep() for msec (bigger bld_ratelimit) and
usleep_range() for usec (smaller bld_ratelimit)?

Even if there is mass wakeup, throttling the threads can avoid the system
wide performance degradation (e.g. 7x slower dd command in another user).
Is that a good justification for throttling the threads?

> 
> > If I split this whole patch set into two patch sets:
> > 1. Three patches in the first patch set: the enumeration patch, the warn
> >and fatal patch, and the documentation patch.
> > 2. Two patches in the second patch set: the ratelimit patch and the
> >documentation patch.
> >
> > Then I will send the two patch sets separately, you will accept them one
> > by one. Is that OK?
> 
> That's obviously the right thing to do because #1 should be ready and we
> can sort out #2 seperately. See the conversation with Tony.

Thank you for picking up the first patch set!

-Fenghua


Re: [PATCH v5 2/3] x86/bus_lock: Handle #DB for bus lock

2021-04-02 Thread Fenghua Yu
Hi, Thomas,

On Sat, Mar 20, 2021 at 02:57:52PM +0100, Thomas Gleixner wrote:
> On Sat, Mar 20 2021 at 02:01, Thomas Gleixner wrote:
> 
> > On Fri, Mar 19 2021 at 21:50, Tony Luck wrote:
> >>>  What is the justifucation for making this rate limit per UID and not
> >>>  per task, per process or systemwide?
> >>
> >> The concern is that a malicious user is running a workload that loops
> >> obtaining the buslock. This brings the whole system to its knees.
> >>
> >> Limiting per task doesn't help. The user can just fork(2) a whole bunch
> >> of tasks for a distributed buslock attack..
> >
> > Fair enough.
> >
> >> Systemwide might be an interesting alternative. Downside would be 
> >> accidental
> >> rate limit of non-malicious tasks that happen to grab a bus lock 
> >> periodically
> >> but in the same window with other buslocks from other users.
> >>
> >> Do you think that a risk worth taking to make the code simpler?
> >
> > I'd consider it low risk, but I just looked for the usage of the
> > existing ratelimit in struct user and the related commit. Nw it's dawns
> > on me where you are coming from.
> 
> So after getting real numbers myself, I have more thoughts on
> this. Setting a reasonable per user limit might be hard when you want to
> protect e.g. against an orchestrated effort by several users
> (containers...). If each of them stays under the limit which is easy
> enough to figure out then you still end up with significant accumulated
> damage.
> 
> So systemwide might not be the worst option after all.

Indeed.

> 
> The question is how wide spread are bus locks in existing applications?
> I haven't found any on a dozen machines with random variants of
> workloads so far according to perf ... -e sq_misc.split_lock.

We have been running various tests widely inside Intel (and also outside)
after enabling split lock and captured a few split lock issues in firmware,
kernel, drivers, and apps. As you know, we have submitted a few patches to
fix the split lock issues in the kernel and drivers (e.g. split lock
in bit ops) and fixed a few split lock issues in firmware.

But so far I'm not aware of any split lock issues in user space yet.
I guess compilers do good cache line alignment good job to avoid this
issue. But inline asm in user apps can easily hit this issue (on purpose).

> 
> What's the actual scenario in the real world where a buslock access
> might be legitimate?

I did a simple experiment: looping on a split locked instruction on
one core in one user can slow down "dd" command running on another core
in another user by 7 times. A malicious user can do similar things to
slow down the whole system performance, right?

> 
> And what's the advice, recommendation for a system administrator how to
> analyze the situation and what kind of parameter to set?
> 
> I tried to get answers from Documentation/x86/buslock.rst, but 

Can I change the sleep code in the handle_bus_lock() to the following?

   while (!__ratelimit(_bld_ratelimit))
   usleep_range(100 / bld_ratelimit,
100 / bld_ratelimit);

Maybe the system wide bus lock ratelimit can be set to default value
1000,000/s which is also the max ratelimit value.

The max sleep in the kernel is 1 us which means max bld_ratelimit
can be up to 1000,000.

If the system administrator think bus locks are less tolerant and wants
to throttle bus lock further, bld_ratelimit can be set as a smaller number.
The smallest bld_ratelimit is 1.

When I gradually decreases bld_ratelimit value, I can see less bus locks
can be issued per second systemwide and "dd" command or other memory
benchmarks are less impacted by the bus locks.

If this works, I will have the buslock.rst doc to explain the situation
and how to set the parameter.

Thanks.

-Fenghua


Re: [PATCH 0/6] Allow signals for IO threads

2021-04-02 Thread Stefan Metzmacher
Am 01.04.21 um 18:24 schrieb Linus Torvalds:
> On Thu, Apr 1, 2021 at 9:00 AM Stefan Metzmacher  wrote:
>>
>> I haven't tried it, but it seems gdb tries to use PTRACE_PEEKUSR
>> against the last thread tid listed under /proc//tasks/ in order to
>> get the architecture for the userspace application
> 
> Christ, what an odd hack. Why wouldn't it just do it on the initial
> thread you actually attached to?
> 
> Are you sure it's not simply because your test-case was to attach to
> the io_uring thread? Because the io_uring thread might as well be
> considered to be 64-bit.

  │   └─io_uring-cp,1396 Makefile file
  │   ├─{iou-mgr-1396},1397
  │   ├─{iou-wrk-1396},1398
  │   └─{iou-wrk-1396},1399

strace -ttT -o strace-uring-fail.txt gdb --pid 1396
(note strace -f would deadlock gdb with SIGSTOP)

The full file can be found here:
https://www.samba.org/~metze/strace-uring-fail.txt
(I guess there was a race and the workers 1398 and 1399 exited in between,
that's it using 1397):

18:46:35.429498 ptrace(PTRACE_PEEKUSER, 1397, 8*CS, [NULL]) = 0 <0.22>

>> so my naive assumption
>> would be that it wouldn't allow the detection of a 32-bit application
>> using a 64-bit kernel.
> 
> I'm not entirely convinced we want to care about a confused gdb
> implementation and somebody debugging a case that I don't believe
> happens in practice.
> 
> 32-bit user space is legacy. And legacy isn't io_uring. If somebody
> insists on doing odd things, they can live with the odd results.

Ok, I'd agree for 32-bit applications, but what about libraries?
E.g. distributions ship libraries like libsmbclient or nss modules
as 64-bit and 32-bit version, in order to support legacy applications
to run. Why shouldn't the 32-bit library builds not support io_uring?
(Note libsmbclient don't use io_uring yet, but I guess it will be in future).

Any ideas regarding similar problems for other architectures?

metze




Re: [RFC PATCH] mm/swap: fix system stuck due to infinite loop

2021-04-02 Thread Andrew Morton
On Fri, 2 Apr 2021 15:03:37 +0800 Stillinux  wrote:

> In the case of high system memory and load pressure, we ran ltp test
> and found that the system was stuck, the direct memory reclaim was
> all stuck in io_schedule, the waiting request was stuck in the blk_plug
> flow of one process, and this process fell into an infinite loop.
> not do the action of brushing out the request.
> 
> The call flow of this process is swap_cluster_readahead.
> Use blk_start/finish_plug for blk_plug operation,
> flow swap_cluster_readahead->__read_swap_cache_async->swapcache_prepare.
> When swapcache_prepare return -EEXIST, it will fall into an infinite loop,
> even if cond_resched is called, but according to the schedule,
> sched_submit_work will be based on tsk->state, and will not flash out
> the blk_plug request, so will hang io, causing the overall system  hang.
> 
> For the first time involving the swap part, there is no good way to fix
> the problem from the fundamental problem. In order to solve the
> engineering situation, we chose to make swap_cluster_readahead aware of
> the memory pressure situation as soon as possible, and do io_schedule to
> flush out the blk_plug request, thereby changing the allocation flag in
> swap_readpage to GFP_NOIO , No longer do the memory reclaim of flush io.
> Although system operating normally, but not the most fundamental way.
> 

Thanks.

I'm not understanding why swapcache_prepare() repeatedly returns
-EEXIST in this situation?

And how does the switch to GFP_NOIO fix this?  Simply by avoiding
direct reclaim altogether?

> ---
>  mm/page_io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/page_io.c b/mm/page_io.c
> index c493ce9ebcf5..87392ffabb12 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -403,7 +403,7 @@ int swap_readpage(struct page *page, bool synchronous)
>   }
> 
>   ret = 0;
> - bio = bio_alloc(GFP_KERNEL, 1);
> + bio = bio_alloc(GFP_NOIO, 1);
>   bio_set_dev(bio, sis->bdev);
>   bio->bi_opf = REQ_OP_READ;
>   bio->bi_iter.bi_sector = swap_page_sector(page);



[GIT PULL] SCSI fixes for 5.12-rc

2021-04-02 Thread James Bottomley
Single fix to iscsi for a rare race condition which can cause a kernel
panic.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Gulam Mohamed (1):
  scsi: iscsi: Fix race condition between login and sync thread

And the diffstat:

 drivers/scsi/scsi_transport_iscsi.c | 14 +-
 include/scsi/scsi_transport_iscsi.h |  1 +
 2 files changed, 14 insertions(+), 1 deletion(-)

With full diff below.

James

---

diff --git a/drivers/scsi/scsi_transport_iscsi.c 
b/drivers/scsi/scsi_transport_iscsi.c
index 969d24d580e2..bebfb355abdf 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2471,6 +2471,7 @@ static void iscsi_if_stop_conn(struct iscsi_cls_conn 
*conn, int flag)
 */
mutex_lock(_mutex);
conn->transport->stop_conn(conn, flag);
+   conn->state = ISCSI_CONN_DOWN;
mutex_unlock(_mutex);
 
 }
@@ -2894,6 +2895,13 @@ iscsi_set_param(struct iscsi_transport *transport, 
struct iscsi_uevent *ev)
default:
err = transport->set_param(conn, ev->u.set_param.param,
   data, ev->u.set_param.len);
+   if ((conn->state == ISCSI_CONN_BOUND) ||
+   (conn->state == ISCSI_CONN_UP)) {
+   err = transport->set_param(conn, ev->u.set_param.param,
+   data, ev->u.set_param.len);
+   } else {
+   return -ENOTCONN;
+   }
}
 
return err;
@@ -2953,6 +2961,7 @@ static int iscsi_if_ep_disconnect(struct iscsi_transport 
*transport,
mutex_lock(>ep_mutex);
conn->ep = NULL;
mutex_unlock(>ep_mutex);
+   conn->state = ISCSI_CONN_DOWN;
}
 
transport->ep_disconnect(ep);
@@ -3713,6 +3722,8 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr 
*nlh, uint32_t *group)
ev->r.retcode = transport->bind_conn(session, conn,
ev->u.b_conn.transport_eph,
ev->u.b_conn.is_leading);
+   if (!ev->r.retcode)
+   conn->state = ISCSI_CONN_BOUND;
mutex_unlock(_mutex);
 
if (ev->r.retcode || !transport->ep_connect)
@@ -3944,7 +3955,8 @@ iscsi_conn_attr(local_ipaddr, ISCSI_PARAM_LOCAL_IPADDR);
 static const char *const connection_state_names[] = {
[ISCSI_CONN_UP] = "up",
[ISCSI_CONN_DOWN] = "down",
-   [ISCSI_CONN_FAILED] = "failed"
+   [ISCSI_CONN_FAILED] = "failed",
+   [ISCSI_CONN_BOUND] = "bound"
 };
 
 static ssize_t show_conn_state(struct device *dev,
diff --git a/include/scsi/scsi_transport_iscsi.h 
b/include/scsi/scsi_transport_iscsi.h
index 8a26a2ffa952..fc5a39839b4b 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -193,6 +193,7 @@ enum iscsi_connection_state {
ISCSI_CONN_UP = 0,
ISCSI_CONN_DOWN,
ISCSI_CONN_FAILED,
+   ISCSI_CONN_BOUND,
 };
 
 struct iscsi_cls_conn {



Re: [PATCH v6 00/27] Memory Folios

2021-04-02 Thread Kent Overstreet
On Wed, Mar 31, 2021 at 07:47:01PM +0100, Matthew Wilcox (Oracle) wrote:
> The medium-term goal is to convert all filesystems and some device
> drivers to work in terms of folios.  This series contains a lot of
> explicit conversions, but it's important to realise it's removing a lot
> of implicit conversions in some relatively hot paths.  There will be very
> few conversions from folios when this work is completed; filesystems,
> the page cache, the LRU and so on will generally only deal with folios.

I'm pretty excited for this to land - 4k page overhead has been a pain point for
me for quite some time. I know this is going to be a lot of churn but I think
leveraging the type system is exactly the right way to go about this, and I
can't wait to start converting bcachefs.


[PATCH 2/2] perf-stat: introduce config stat.bpf-counter-events

2021-04-02 Thread Song Liu
Currently, to use BPF to aggregate perf event counters, the user uses
--bpf-counters option. Enable "use bpf by default" events with a config
option, stat.bpf-counter-events. This is limited to hardware events in
evsel__hw_names.

This also enables mixed BPF event and regular event in the same sesssion.
For example:

   perf config stat.bpf-counter-events=instructions
   perf stat -e instructions,cs

The second command will use BPF for "instructions" but not "cs".

Signed-off-by: Song Liu 
---
 tools/perf/Documentation/perf-stat.txt |  2 ++
 tools/perf/builtin-stat.c  | 41 --
 tools/perf/util/bpf_counter.c  | 11 +++
 tools/perf/util/config.c   | 32 
 tools/perf/util/evsel.c|  2 ++
 tools/perf/util/evsel.h|  1 +
 tools/perf/util/target.h   |  5 
 7 files changed, 74 insertions(+), 20 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt 
b/tools/perf/Documentation/perf-stat.txt
index 744211fa8c186..6d4733eaac170 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -97,6 +97,8 @@ report::
Use BPF programs to aggregate readings from perf_events.  This
allows multiple perf-stat sessions that are counting the same metric 
(cycles,
instructions, etc.) to share hardware counters.
+   To use BPF programs on common hardware events by default, use
+   "perf config stat.bpf-counter-events=".
 
 --bpf-attr-map::
With option "--bpf-counters", different perf-stat sessions share
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 4bb48c6b66980..5adfa708ffe68 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -423,17 +423,28 @@ static int read_affinity_counters(struct timespec *rs)
return 0;
 }
 
+/*
+ * Returns:
+ * 0   if all events use BPF;
+ * 1   if some events do NOT use BPF;
+ * < 0 on errors;
+ */
 static int read_bpf_map_counters(void)
 {
+   bool has_none_bpf_events = false;
struct evsel *counter;
int err;
 
evlist__for_each_entry(evsel_list, counter) {
+   if (!counter->bpf_counter_ops) {
+   has_none_bpf_events = true;
+   continue;
+   }
err = bpf_counter__read(counter);
if (err)
return err;
}
-   return 0;
+   return has_none_bpf_events ? 1 : 0;
 }
 
 static void read_counters(struct timespec *rs)
@@ -442,9 +453,10 @@ static void read_counters(struct timespec *rs)
int err;
 
if (!stat_config.stop_read_counter) {
-   if (target__has_bpf())
-   err = read_bpf_map_counters();
-   else
+   err = read_bpf_map_counters();
+   if (err < 0)
+   return;
+   if (err)
err = read_affinity_counters(rs);
if (err < 0)
return;
@@ -535,12 +547,13 @@ static int enable_counters(void)
struct evsel *evsel;
int err;
 
-   if (target__has_bpf()) {
-   evlist__for_each_entry(evsel_list, evsel) {
-   err = bpf_counter__enable(evsel);
-   if (err)
-   return err;
-   }
+   evlist__for_each_entry(evsel_list, evsel) {
+   if (!evsel->bpf_counter_ops)
+   continue;
+
+   err = bpf_counter__enable(evsel);
+   if (err)
+   return err;
}
 
if (stat_config.initial_delay < 0) {
@@ -784,11 +797,9 @@ static int __run_perf_stat(int argc, const char **argv, 
int run_idx)
if (affinity__setup() < 0)
return -1;
 
-   if (target__has_bpf()) {
-   evlist__for_each_entry(evsel_list, counter) {
-   if (bpf_counter__load(counter, ))
-   return -1;
-   }
+   evlist__for_each_entry(evsel_list, counter) {
+   if (bpf_counter__load(counter, ))
+   return -1;
}
 
evlist__for_each_cpu (evsel_list, i, cpu) {
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
index df70c8dcf7850..ea45914af1693 100644
--- a/tools/perf/util/bpf_counter.c
+++ b/tools/perf/util/bpf_counter.c
@@ -792,6 +792,17 @@ int bpf_counter__load(struct evsel *evsel, struct target 
*target)
evsel->bpf_counter_ops = _program_profiler_ops;
else if (target->use_bpf)
evsel->bpf_counter_ops = _ops;
+   else {
+   int i;
+
+   for (i = 0; i < PERF_COUNT_HW_MAX; i++) {
+   if (!strcmp(evsel->name, evsel__hw_names[i])) {
+   if (evsel__use_bpf_counters[i])
+

[PATCH 1/2] perf util: move bperf definitions to a libperf header

2021-04-02 Thread Song Liu
By following the same protocol, other tools can share hardware PMCs with
perf. Move perf_event_attr_map_entry and BPERF_DEFAULT_ATTR_MAP_PATH to
bperf.h for other tools to use.

Also add bperf_attr_map_compatible() to check whether existing attr_map
is compatible with current perf binary.

Signed-off-by: Song Liu 
---
 tools/lib/perf/include/perf/bperf.h | 29 +++
 tools/perf/util/bpf_counter.c   | 44 ++---
 2 files changed, 50 insertions(+), 23 deletions(-)
 create mode 100644 tools/lib/perf/include/perf/bperf.h

diff --git a/tools/lib/perf/include/perf/bperf.h 
b/tools/lib/perf/include/perf/bperf.h
new file mode 100644
index 0..02b2fd5e50c75
--- /dev/null
+++ b/tools/lib/perf/include/perf/bperf.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+#ifndef __LIBPERF_BPERF_H
+#define __LIBPERF_BPERF_H
+
+/*
+ * bperf uses a hashmap, the attr_map, to track all the leader programs.
+ * The hashmap is pinned in bpffs. flock() on this file is used to ensure
+ * no concurrent access to the attr_map.  The key of attr_map is struct
+ * perf_event_attr, and the value is struct perf_event_attr_map_entry.
+ *
+ * struct perf_event_attr_map_entry contains two __u32 IDs, bpf_link of the
+ * leader prog, and the diff_map. Each perf-stat session holds a reference
+ * to the bpf_link to make sure the leader prog is attached to sched_switch
+ * tracepoint.
+ *
+ * Since the hashmap only contains IDs of the bpf_link and diff_map, it
+ * does not hold any references to the leader program. Once all perf-stat
+ * sessions of these events exit, the leader prog, its maps, and the
+ * perf_events will be freed.
+ */
+struct perf_event_attr_map_entry {
+   __u32 link_id;
+   __u32 diff_map_id;
+};
+
+/* pin the map at sysfs__mountpoint()/BPERF_DEFAULT_ATTR_MAP_PATH */
+#define BPERF_DEFAULT_ATTR_MAP_PATH "fs/bpf/perf_attr_map"
+
+#endif /* __LIBPERF_BPERF_H */
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
index 81d1df3c4ec0e..df70c8dcf7850 100644
--- a/tools/perf/util/bpf_counter.c
+++ b/tools/perf/util/bpf_counter.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "bpf_counter.h"
 #include "counts.h"
@@ -29,28 +30,6 @@
 #include "bpf_skel/bperf_leader.skel.h"
 #include "bpf_skel/bperf_follower.skel.h"
 
-/*
- * bperf uses a hashmap, the attr_map, to track all the leader programs.
- * The hashmap is pinned in bpffs. flock() on this file is used to ensure
- * no concurrent access to the attr_map.  The key of attr_map is struct
- * perf_event_attr, and the value is struct perf_event_attr_map_entry.
- *
- * struct perf_event_attr_map_entry contains two __u32 IDs, bpf_link of the
- * leader prog, and the diff_map. Each perf-stat session holds a reference
- * to the bpf_link to make sure the leader prog is attached to sched_switch
- * tracepoint.
- *
- * Since the hashmap only contains IDs of the bpf_link and diff_map, it
- * does not hold any references to the leader program. Once all perf-stat
- * sessions of these events exit, the leader prog, its maps, and the
- * perf_events will be freed.
- */
-struct perf_event_attr_map_entry {
-   __u32 link_id;
-   __u32 diff_map_id;
-};
-
-#define DEFAULT_ATTR_MAP_PATH "fs/bpf/perf_attr_map"
 #define ATTR_MAP_SIZE 16
 
 static inline void *u64_to_ptr(__u64 ptr)
@@ -333,6 +312,20 @@ static __u32 bpf_map_get_id(int fd)
return map_info.id;
 }
 
+static bool bperf_attr_map_compatible(int attr_map_fd)
+{
+   struct bpf_map_info map_info = {0};
+   __u32 map_info_len = sizeof(map_info);
+   int err;
+
+   err = bpf_obj_get_info_by_fd(attr_map_fd, _info, _info_len);
+
+   if (err)
+   return false;
+   return (map_info.key_size == sizeof(struct perf_event_attr)) &&
+   (map_info.value_size == sizeof(struct 
perf_event_attr_map_entry));
+}
+
 static int bperf_lock_attr_map(struct target *target)
 {
char path[PATH_MAX];
@@ -342,7 +335,7 @@ static int bperf_lock_attr_map(struct target *target)
scnprintf(path, PATH_MAX, "%s", target->attr_map);
} else {
scnprintf(path, PATH_MAX, "%s/%s", sysfs__mountpoint(),
- DEFAULT_ATTR_MAP_PATH);
+ BPERF_DEFAULT_ATTR_MAP_PATH);
}
 
if (access(path, F_OK)) {
@@ -367,6 +360,11 @@ static int bperf_lock_attr_map(struct target *target)
return -1;
}
 
+   if (!bperf_attr_map_compatible(map_fd)) {
+   close(map_fd);
+   return -1;
+
+   }
err = flock(map_fd, LOCK_EX);
if (err) {
close(map_fd);
-- 
2.30.2



Re: [PATCH -next v2] libbpf: remove redundant semi-colon

2021-04-02 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to bpf/bpf-next.git (refs/heads/master):

On Fri, 2 Apr 2021 09:26:34 +0800 you wrote:
> Remove redundant semi-colon in infinalize_btf_ext().
> 
> Signed-off-by: Yang Yingliang 
> ---
> v2:
>   add commit log
> 
> [...]

Here is the summary with links:
  - [-next,v2] libbpf: remove redundant semi-colon
https://git.kernel.org/bpf/bpf-next/c/f07669df4c8d

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




Re: [PATCH] usb: cdnsp: Fixes issue with Configure Endpoint command

2021-04-02 Thread Peter Chen
On 21-03-30 04:26:10, Pawel Laszczak wrote:
> Hi Peter,
> 
> >
> >On 21-03-22 07:09:02, Pawel Laszczak wrote:
> >> From: Pawel Laszczak 
> >>
> >> Patch adds flag EP_UNCONFIGURED to detect whether endpoint was
> >> unconfigured. This flag is set in cdnsp_reset_device after Reset Device
> >> command. Among others this command disables all non control endpoints.
> >> Flag is used in cdnsp_gadget_ep_disable to protect controller against
> >> invoking Configure Endpoint command on disabled endpoint. Lack of this
> >> protection in some cases caused that Configure Endpoint command completed
> >> with Context State Error code completion.
> >>
> >> Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP 
> >> DRD Driver")
> >> Signed-off-by: Pawel Laszczak 
> >> ---
> >>  drivers/usb/cdns3/cdnsp-gadget.c | 18 +-
> >>  drivers/usb/cdns3/cdnsp-gadget.h | 11 ++-
> >>  2 files changed, 19 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/drivers/usb/cdns3/cdnsp-gadget.c 
> >> b/drivers/usb/cdns3/cdnsp-gadget.c
> >> index d7d4bdd57f46..de17cc4ad91a 100644
> >> --- a/drivers/usb/cdns3/cdnsp-gadget.c
> >> +++ b/drivers/usb/cdns3/cdnsp-gadget.c
> >> @@ -727,7 +727,7 @@ int cdnsp_reset_device(struct cdnsp_device *pdev)
> >> * are in Disabled state.
> >> */
> >>for (i = 1; i < CDNSP_ENDPOINTS_NUM; ++i)
> >> -  pdev->eps[i].ep_state |= EP_STOPPED;
> >> +  pdev->eps[i].ep_state |= EP_STOPPED | EP_UNCONFIGURED;
> >>
> >>trace_cdnsp_handle_cmd_reset_dev(slot_ctx);
> >>
> >> @@ -942,6 +942,7 @@ static int cdnsp_gadget_ep_enable(struct usb_ep *ep,
> >>
> >>pep = to_cdnsp_ep(ep);
> >>pdev = pep->pdev;
> >> +  pep->ep_state &= ~EP_UNCONFIGURED;
> >>
> >>if (dev_WARN_ONCE(pdev->dev, pep->ep_state & EP_ENABLED,
> >>  "%s is already enabled\n", pep->name))
> >> @@ -1023,9 +1024,13 @@ static int cdnsp_gadget_ep_disable(struct usb_ep 
> >> *ep)
> >>goto finish;
> >>}
> >>
> >> -  cdnsp_cmd_stop_ep(pdev, pep);
> >>pep->ep_state |= EP_DIS_IN_RROGRESS;
> >> -  cdnsp_cmd_flush_ep(pdev, pep);
> >> +
> >> +  /* Endpoint was unconfigured by Reset Device command. */
> >> +  if (!(pep->ep_state & EP_UNCONFIGURED)) {
> >> +  cdnsp_cmd_stop_ep(pdev, pep);
> >> +  cdnsp_cmd_flush_ep(pdev, pep);
> >> +  }
> >>
> >>/* Remove all queued USB requests. */
> >>while (!list_empty(>pending_list)) {
> >> @@ -1036,6 +1041,7 @@ static int cdnsp_gadget_ep_disable(struct usb_ep *ep)
> >>cdnsp_invalidate_ep_events(pdev, pep);
> >>
> >>pep->ep_state &= ~EP_DIS_IN_RROGRESS;
> >> +
> >
> >Useless blank line
> >
> >>drop_flag = cdnsp_get_endpoint_flag(pep->endpoint.desc);
> >>ctrl_ctx = cdnsp_get_input_control_ctx(>in_ctx);
> >>ctrl_ctx->drop_flags = cpu_to_le32(drop_flag);
> >> @@ -1043,10 +1049,12 @@ static int cdnsp_gadget_ep_disable(struct usb_ep 
> >> *ep)
> >>
> >>cdnsp_endpoint_zero(pdev, pep);
> >>
> >> -  ret = cdnsp_update_eps_configuration(pdev, pep);
> >> +  if (!(pep->ep_state & EP_UNCONFIGURED))
> >> +  ret = cdnsp_update_eps_configuration(pdev, pep);
> >> +
> >>cdnsp_free_endpoint_rings(pdev, pep);
> >>
> >> -  pep->ep_state &= ~EP_ENABLED;
> >> +  pep->ep_state &= ~(EP_ENABLED | EP_UNCONFIGURED);
> >>pep->ep_state |= EP_STOPPED;
> >>
> >>  finish:
> >> diff --git a/drivers/usb/cdns3/cdnsp-gadget.h 
> >> b/drivers/usb/cdns3/cdnsp-gadget.h
> >> index 6bbb26548c04..e628bd539e23 100644
> >> --- a/drivers/usb/cdns3/cdnsp-gadget.h
> >> +++ b/drivers/usb/cdns3/cdnsp-gadget.h
> >> @@ -830,11 +830,12 @@ struct cdnsp_ep {
> >>unsigned int ep_state;
> >>  #define EP_ENABLEDBIT(0)
> >>  #define EP_DIS_IN_RROGRESSBIT(1)
> >> -#define EP_HALTED BIT(2)
> >> -#define EP_STOPPEDBIT(3)
> >> -#define EP_WEDGE  BIT(4)
> >> -#define EP0_HALTED_STATUS BIT(5)
> >> -#define EP_HAS_STREAMSBIT(6)
> >> +#define EP_UNCONFIGURED   BIT(2)
> >
> >Why add new flag as BIT(2), it causes many changes in this patch?
> 
> In my feeling, EP_UNCONFIGURED is more associates with the first 2 flags, so 
> I've decided 
> put it after BIT(1).  

No, you may not add such relationship, each flag has its own meaning,
otherwise, the sequence of flag bitmap may be changed again.

Peter

> 
> >
> >> +#define EP_HALTED BIT(3)
> >> +#define EP_STOPPEDBIT(4)
> >> +#define EP_WEDGE  BIT(5)
> >> +#define EP0_HALTED_STATUS BIT(6)
> >> +#define EP_HAS_STREAMSBIT(7)
> >>
> >>bool skip;
> >>  };
> >> --
> >> 2.25.1
> >>
> >
> >--
> >
> >Thanks,
> >Peter Chen
> 

-- 

Thanks,
Peter Chen



Re: [PATCH] linux/bpf-cgroup.h: Delete repeated struct declaration

2021-04-02 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to bpf/bpf-next.git (refs/heads/master):

On Thu,  1 Apr 2021 14:46:37 +0800 you wrote:
> struct bpf_prog is declared twice. There is one declaration
> which is independent on the MACRO at 18th line.
> So the below one is not needed though. Remove the duplicate.
> 
> Signed-off-by: Wan Jiabing 
> ---
>  include/linux/bpf-cgroup.h | 1 -
>  1 file changed, 1 deletion(-)

Here is the summary with links:
  - linux/bpf-cgroup.h: Delete repeated struct declaration
https://git.kernel.org/bpf/bpf-next/c/2daae89666ad

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




Re: [Patch bpf-next] bpf: remove unused parameter from ___bpf_prog_run

2021-04-02 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to bpf/bpf-next.git (refs/heads/master):

On Wed, 31 Mar 2021 07:51:35 + you wrote:
> 'stack' parameter is not used in ___bpf_prog_run,
> the base address have been set to FP reg. So consequently remove it.
> 
> Signed-off-by: He Fengqing 
> ---
>  kernel/bpf/core.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)

Here is the summary with links:
  - [bpf-next] bpf: remove unused parameter from ___bpf_prog_run
https://git.kernel.org/bpf/bpf-next/c/2ec9898e9c70

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




Re: [PATCH] linux/bpf.h: Remove repeated struct declaration

2021-04-02 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to bpf/bpf-next.git (refs/heads/master):

On Thu,  1 Apr 2021 15:20:37 +0800 you wrote:
> struct btf_type is declared twice. One is declared at 35th line.
> The blew one is not needed. Remove the duplicate.
> 
> Signed-off-by: Wan Jiabing 
> ---
>  include/linux/bpf.h | 1 -
>  1 file changed, 1 deletion(-)

Here is the summary with links:
  - linux/bpf.h: Remove repeated struct declaration
https://git.kernel.org/bpf/bpf-next/c/6ac4c6f887f5

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




[PATCH 5/5] KVM: SVM: Allocate SEV command structures on local stack

2021-04-02 Thread Sean Christopherson
Use the local stack to "allocate" the structures used to communicate with
the PSP.  The largest struct used by KVM, sev_data_launch_secret, clocks
in at 52 bytes, well within the realm of reasonable stack usage.  The
smallest structs are a mere 4 bytes, i.e. the pointer for the allocation
is larger than the allocation itself.

Now that the PSP driver plays nice with vmalloc pointers, putting the
data on a virtually mapped stack (CONFIG_VMAP_STACK=y) will not cause
explosions.

Cc: Brijesh Singh 
Cc: Tom Lendacky 
Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/svm/sev.c | 262 +++--
 1 file changed, 96 insertions(+), 166 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 5457138c7347..316fd39c7aef 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -150,35 +150,22 @@ static void sev_asid_free(int asid)
 
 static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
 {
-   struct sev_data_decommission *decommission;
-   struct sev_data_deactivate *data;
+   struct sev_data_decommission decommission;
+   struct sev_data_deactivate deactivate;
 
if (!handle)
return;
 
-   data = kzalloc(sizeof(*data), GFP_KERNEL);
-   if (!data)
-   return;
-
-   /* deactivate handle */
-   data->handle = handle;
+   deactivate.handle = handle;
 
/* Guard DEACTIVATE against WBINVD/DF_FLUSH used in ASID recycling */
down_read(_deactivate_lock);
-   sev_guest_deactivate(data, NULL);
+   sev_guest_deactivate(, NULL);
up_read(_deactivate_lock);
 
-   kfree(data);
-
-   decommission = kzalloc(sizeof(*decommission), GFP_KERNEL);
-   if (!decommission)
-   return;
-
/* decommission handle */
-   decommission->handle = handle;
-   sev_guest_decommission(decommission, NULL);
-
-   kfree(decommission);
+   decommission.handle = handle;
+   sev_guest_decommission(, NULL);
 }
 
 static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
@@ -216,19 +203,14 @@ static int sev_guest_init(struct kvm *kvm, struct 
kvm_sev_cmd *argp)
 
 static int sev_bind_asid(struct kvm *kvm, unsigned int handle, int *error)
 {
-   struct sev_data_activate *data;
+   struct sev_data_activate activate;
int asid = sev_get_asid(kvm);
int ret;
 
-   data = kzalloc(sizeof(*data), GFP_KERNEL_ACCOUNT);
-   if (!data)
-   return -ENOMEM;
-
/* activate ASID on the given handle */
-   data->handle = handle;
-   data->asid   = asid;
-   ret = sev_guest_activate(data, error);
-   kfree(data);
+   activate.handle = handle;
+   activate.asid   = asid;
+   ret = sev_guest_activate(, error);
 
return ret;
 }
@@ -258,7 +240,7 @@ static int sev_issue_cmd(struct kvm *kvm, int id, void 
*data, int *error)
 static int sev_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
 {
struct kvm_sev_info *sev = _kvm_svm(kvm)->sev_info;
-   struct sev_data_launch_start *start;
+   struct sev_data_launch_start start;
struct kvm_sev_launch_start params;
void *dh_blob, *session_blob;
int *error = >error;
@@ -270,20 +252,16 @@ static int sev_launch_start(struct kvm *kvm, struct 
kvm_sev_cmd *argp)
if (copy_from_user(, (void __user *)(uintptr_t)argp->data, 
sizeof(params)))
return -EFAULT;
 
-   start = kzalloc(sizeof(*start), GFP_KERNEL_ACCOUNT);
-   if (!start)
-   return -ENOMEM;
+   memset(, 0, sizeof(start));
 
dh_blob = NULL;
if (params.dh_uaddr) {
dh_blob = psp_copy_user_blob(params.dh_uaddr, params.dh_len);
-   if (IS_ERR(dh_blob)) {
-   ret = PTR_ERR(dh_blob);
-   goto e_free;
-   }
+   if (IS_ERR(dh_blob))
+   return PTR_ERR(dh_blob);
 
-   start->dh_cert_address = __sme_set(__pa(dh_blob));
-   start->dh_cert_len = params.dh_len;
+   start.dh_cert_address = __sme_set(__pa(dh_blob));
+   start.dh_cert_len = params.dh_len;
}
 
session_blob = NULL;
@@ -294,40 +272,38 @@ static int sev_launch_start(struct kvm *kvm, struct 
kvm_sev_cmd *argp)
goto e_free_dh;
}
 
-   start->session_address = __sme_set(__pa(session_blob));
-   start->session_len = params.session_len;
+   start.session_address = __sme_set(__pa(session_blob));
+   start.session_len = params.session_len;
}
 
-   start->handle = params.handle;
-   start->policy = params.policy;
+   start.handle = params.handle;
+   start.policy = params.policy;
 
/* create memory encryption context */
-   ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_LAUNCH_START, start, error);
+   ret = 

[PATCH 4/5] crypto: ccp: Use the stack for small SEV command buffers

2021-04-02 Thread Sean Christopherson
For commands with small input/output buffers, use the local stack to
"allocate" the structures used to communicate with the PSP.   Now that
__sev_do_cmd_locked() gracefully handles vmalloc'd buffers, there's no
reason to avoid using the stack, e.g. CONFIG_VMAP_STACK=y will just work.

Signed-off-by: Sean Christopherson 
---
 drivers/crypto/ccp/sev-dev.c | 122 ++-
 1 file changed, 47 insertions(+), 75 deletions(-)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 6d5882290cfc..6a380d483fce 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -402,7 +402,7 @@ static int sev_ioctl_do_pek_csr(struct sev_issue_cmd *argp, 
bool writable)
 {
struct sev_device *sev = psp_master->sev_data;
struct sev_user_data_pek_csr input;
-   struct sev_data_pek_csr *data;
+   struct sev_data_pek_csr data;
void __user *input_address;
void *blob = NULL;
int ret;
@@ -413,29 +413,24 @@ static int sev_ioctl_do_pek_csr(struct sev_issue_cmd 
*argp, bool writable)
if (copy_from_user(, (void __user *)argp->data, sizeof(input)))
return -EFAULT;
 
-   data = kzalloc(sizeof(*data), GFP_KERNEL);
-   if (!data)
-   return -ENOMEM;
-
/* userspace wants to query CSR length */
-   if (!input.address || !input.length)
+   if (!input.address || !input.length) {
+   data.address = 0;
+   data.len = 0;
goto cmd;
+   }
 
/* allocate a physically contiguous buffer to store the CSR blob */
input_address = (void __user *)input.address;
-   if (input.length > SEV_FW_BLOB_MAX_SIZE) {
-   ret = -EFAULT;
-   goto e_free;
-   }
+   if (input.length > SEV_FW_BLOB_MAX_SIZE)
+   return -EFAULT;
 
blob = kmalloc(input.length, GFP_KERNEL);
-   if (!blob) {
-   ret = -ENOMEM;
-   goto e_free;
-   }
+   if (!blob)
+   return -ENOMEM;
 
-   data->address = __psp_pa(blob);
-   data->len = input.length;
+   data.address = __psp_pa(blob);
+   data.len = input.length;
 
 cmd:
if (sev->state == SEV_STATE_UNINIT) {
@@ -444,10 +439,10 @@ static int sev_ioctl_do_pek_csr(struct sev_issue_cmd 
*argp, bool writable)
goto e_free_blob;
}
 
-   ret = __sev_do_cmd_locked(SEV_CMD_PEK_CSR, data, >error);
+   ret = __sev_do_cmd_locked(SEV_CMD_PEK_CSR, , >error);
 
 /* If we query the CSR length, FW responded with expected data. */
-   input.length = data->len;
+   input.length = data.len;
 
if (copy_to_user((void __user *)argp->data, , sizeof(input))) {
ret = -EFAULT;
@@ -461,8 +456,6 @@ static int sev_ioctl_do_pek_csr(struct sev_issue_cmd *argp, 
bool writable)
 
 e_free_blob:
kfree(blob);
-e_free:
-   kfree(data);
return ret;
 }
 
@@ -594,7 +587,7 @@ static int sev_ioctl_do_pek_import(struct sev_issue_cmd 
*argp, bool writable)
 {
struct sev_device *sev = psp_master->sev_data;
struct sev_user_data_pek_cert_import input;
-   struct sev_data_pek_cert_import *data;
+   struct sev_data_pek_cert_import data;
void *pek_blob, *oca_blob;
int ret;
 
@@ -604,19 +597,14 @@ static int sev_ioctl_do_pek_import(struct sev_issue_cmd 
*argp, bool writable)
if (copy_from_user(, (void __user *)argp->data, sizeof(input)))
return -EFAULT;
 
-   data = kzalloc(sizeof(*data), GFP_KERNEL);
-   if (!data)
-   return -ENOMEM;
-
/* copy PEK certificate blobs from userspace */
pek_blob = psp_copy_user_blob(input.pek_cert_address, 
input.pek_cert_len);
-   if (IS_ERR(pek_blob)) {
-   ret = PTR_ERR(pek_blob);
-   goto e_free;
-   }
+   if (IS_ERR(pek_blob))
+   return PTR_ERR(pek_blob);
 
-   data->pek_cert_address = __psp_pa(pek_blob);
-   data->pek_cert_len = input.pek_cert_len;
+   data.reserved = 0;
+   data.pek_cert_address = __psp_pa(pek_blob);
+   data.pek_cert_len = input.pek_cert_len;
 
/* copy PEK certificate blobs from userspace */
oca_blob = psp_copy_user_blob(input.oca_cert_address, 
input.oca_cert_len);
@@ -625,8 +613,8 @@ static int sev_ioctl_do_pek_import(struct sev_issue_cmd 
*argp, bool writable)
goto e_free_pek;
}
 
-   data->oca_cert_address = __psp_pa(oca_blob);
-   data->oca_cert_len = input.oca_cert_len;
+   data.oca_cert_address = __psp_pa(oca_blob);
+   data.oca_cert_len = input.oca_cert_len;
 
/* If platform is not in INIT state then transition it to INIT */
if (sev->state != SEV_STATE_INIT) {
@@ -635,21 +623,19 @@ static int sev_ioctl_do_pek_import(struct sev_issue_cmd 
*argp, bool writable)
goto e_free_oca;
}
 
-   ret = 

[PATCH 3/5] crypto: ccp: Play nice with vmalloc'd memory for SEV command structs

2021-04-02 Thread Sean Christopherson
Copy vmalloc'd data to an internal buffer instead of rejecting outright
so that callers can put SEV command buffers on the stack without running
afoul of CONFIG_VMAP_STACK=y.  Currently, the largest supported command
takes a 68 byte buffer, i.e. pretty much every command can be put on the
stack.  Because sev_cmd_mutex is held for the entirety of a transaction,
only a single bounce buffer is required.

Use a flexible array for the buffer, sized to hold the largest known
command.   Alternatively, the buffer could be a union of all known
command structs, but that would incur a higher maintenance cost due to
the need to update the union for every command in addition to updating
the existing sev_cmd_buffer_len().

Align the buffer to an 8-byte boundary, mimicking the alignment that
would be provided by the compiler if any of the structs were embedded
directly.  Note, sizeof() correctly incorporates this alignment.

Cc: Brijesh Singh 
Cc: Borislav Petkov 
Cc: Tom Lendacky 
Signed-off-by: Sean Christopherson 
---
 drivers/crypto/ccp/sev-dev.c | 33 +++--
 drivers/crypto/ccp/sev-dev.h |  7 +++
 2 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 4c513318f16a..6d5882290cfc 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -135,13 +135,14 @@ static int sev_cmd_buffer_len(int cmd)
return 0;
 }
 
-static int __sev_do_cmd_locked(int cmd, void *data, int *psp_ret)
+static int __sev_do_cmd_locked(int cmd, void *__data, int *psp_ret)
 {
struct psp_device *psp = psp_master;
struct sev_device *sev;
unsigned int phys_lsb, phys_msb;
unsigned int reg, ret = 0;
int buf_len;
+   void *data;
 
if (!psp || !psp->sev_data)
return -ENODEV;
@@ -152,11 +153,21 @@ static int __sev_do_cmd_locked(int cmd, void *data, int 
*psp_ret)
sev = psp->sev_data;
 
buf_len = sev_cmd_buffer_len(cmd);
-   if (WARN_ON_ONCE(!!data != !!buf_len))
+   if (WARN_ON_ONCE(!!__data != !!buf_len))
return -EINVAL;
 
-   if (WARN_ON_ONCE(data && is_vmalloc_addr(data)))
-   return -EINVAL;
+   if (__data && is_vmalloc_addr(__data)) {
+   /*
+* If the incoming buffer is virtually allocated, copy it to
+* the driver's scratch buffer as __pa() will not work for such
+* addresses, vmalloc_to_page() is not guaranteed to succeed,
+* and vmalloc'd data may not be physically contiguous.
+*/
+   data = sev->cmd_buf;
+   memcpy(data, __data, buf_len);
+   } else {
+   data = __data;
+   }
 
/* Get the physical address of the command buffer */
phys_lsb = data ? lower_32_bits(__psp_pa(data)) : 0;
@@ -204,6 +215,13 @@ static int __sev_do_cmd_locked(int cmd, void *data, int 
*psp_ret)
print_hex_dump_debug("(out): ", DUMP_PREFIX_OFFSET, 16, 2, data,
 buf_len, false);
 
+   /*
+* Copy potential output from the PSP back to __data.  Do this even on
+* failure in case the caller wants to glean something from the error.
+*/
+   if (__data && data != __data)
+   memcpy(__data, data, buf_len);
+
return ret;
 }
 
@@ -978,9 +996,12 @@ int sev_dev_init(struct psp_device *psp)
 {
struct device *dev = psp->dev;
struct sev_device *sev;
-   int ret = -ENOMEM;
+   int ret = -ENOMEM, cmd_buf_size = 0, i;
 
-   sev = devm_kzalloc(dev, sizeof(*sev), GFP_KERNEL);
+   for (i = 0; i < SEV_CMD_MAX; i++)
+   cmd_buf_size = max(cmd_buf_size, sev_cmd_buffer_len(i));
+
+   sev = devm_kzalloc(dev, sizeof(*sev) + cmd_buf_size, GFP_KERNEL);
if (!sev)
goto e_err;
 
diff --git a/drivers/crypto/ccp/sev-dev.h b/drivers/crypto/ccp/sev-dev.h
index dd5c4fe82914..b43283ce2d73 100644
--- a/drivers/crypto/ccp/sev-dev.h
+++ b/drivers/crypto/ccp/sev-dev.h
@@ -52,6 +52,13 @@ struct sev_device {
u8 api_major;
u8 api_minor;
u8 build;
+
+   /*
+* Buffer used for incoming commands whose physical address cannot be
+* resolved via __pa(), e.g. stack pointers when CONFIG_VMAP_STACK=y.
+* Note, alignment isn't strictly required.
+*/
+   u8 cmd_buf[] __aligned(8);
 };
 
 int sev_dev_init(struct psp_device *psp);
-- 
2.31.0.208.g409f899ff0-goog



[PATCH 2/5] crypto: ccp: Reject SEV commands with mismatching command buffer

2021-04-02 Thread Sean Christopherson
WARN on and reject SEV commands that provide a valid data pointer, but do
not have a known, non-zero length.  And conversely, reject commands that
take a command buffer but none is provided.

Aside from sanity checking intput, disallowing a non-null pointer without
a non-zero size will allow a future patch to cleanly handle vmalloc'd
data by copying the data to an internal __pa() friendly buffer.

Note, this also effectively prevents callers from using commands that
have a non-zero length and are not known to the kernel.  This is not an
explicit goal, but arguably the side effect is a good thing from the
kernel's perspective.

Cc: Brijesh Singh 
Cc: Borislav Petkov 
Cc: Tom Lendacky 
Signed-off-by: Sean Christopherson 
---
 drivers/crypto/ccp/sev-dev.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 6556d220713b..4c513318f16a 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -141,6 +141,7 @@ static int __sev_do_cmd_locked(int cmd, void *data, int 
*psp_ret)
struct sev_device *sev;
unsigned int phys_lsb, phys_msb;
unsigned int reg, ret = 0;
+   int buf_len;
 
if (!psp || !psp->sev_data)
return -ENODEV;
@@ -150,7 +151,11 @@ static int __sev_do_cmd_locked(int cmd, void *data, int 
*psp_ret)
 
sev = psp->sev_data;
 
-   if (data && WARN_ON_ONCE(is_vmalloc_addr(data)))
+   buf_len = sev_cmd_buffer_len(cmd);
+   if (WARN_ON_ONCE(!!data != !!buf_len))
+   return -EINVAL;
+
+   if (WARN_ON_ONCE(data && is_vmalloc_addr(data)))
return -EINVAL;
 
/* Get the physical address of the command buffer */
@@ -161,7 +166,7 @@ static int __sev_do_cmd_locked(int cmd, void *data, int 
*psp_ret)
cmd, phys_msb, phys_lsb, psp_timeout);
 
print_hex_dump_debug("(in):  ", DUMP_PREFIX_OFFSET, 16, 2, data,
-sev_cmd_buffer_len(cmd), false);
+buf_len, false);
 
iowrite32(phys_lsb, sev->io_regs + sev->vdata->cmdbuff_addr_lo_reg);
iowrite32(phys_msb, sev->io_regs + sev->vdata->cmdbuff_addr_hi_reg);
@@ -197,7 +202,7 @@ static int __sev_do_cmd_locked(int cmd, void *data, int 
*psp_ret)
}
 
print_hex_dump_debug("(out): ", DUMP_PREFIX_OFFSET, 16, 2, data,
-sev_cmd_buffer_len(cmd), false);
+buf_len, false);
 
return ret;
 }
-- 
2.31.0.208.g409f899ff0-goog



[PATCH 0/5] ccp: KVM: SVM: Use stack for SEV command buffers

2021-04-02 Thread Sean Christopherson
While doing minor KVM cleanup to account various kernel allocations, I
noticed that all of the SEV command buffers are allocated via kmalloc(),
even for commands whose payloads is smaller than a pointer.  After much
head scratching, the only reason I could come up with for dynamically
allocating the command data is CONFIG_VMAP_STACK=y.

This series teaches __sev_do_cmd_locked() to gracefully handle vmalloc'd
command buffers by copying such buffers an internal buffer before sending
the command to the PSP.  The SEV driver and KVM are then converted to use
the stack for all command buffers.

The first patch is optional, I included it in case someone wants to
backport it to stable kernels.  It wouldn't actually fix bugs, but it
would make debugging issues a lot easier if they did pop up.

Tested everything except sev_ioctl_do_pek_import(), I don't know anywhere
near enough about the PSP to give it the right input.

Based on kvm/queue, commit f96be2deac9b ("KVM: x86: Support KVM VMs
sharing SEV context") to avoid a minor conflict.

Sean Christopherson (5):
  crypto: ccp: Detect and reject vmalloc addresses destined for PSP
  crypto: ccp: Reject SEV commands with mismatching command buffer
  crypto: ccp: Play nice with vmalloc'd memory for SEV command structs
  crypto: ccp: Use the stack for small SEV command buffers
  KVM: SVM: Allocate SEV command structures on local stack

 arch/x86/kvm/svm/sev.c   | 262 +--
 drivers/crypto/ccp/sev-dev.c | 161 ++---
 drivers/crypto/ccp/sev-dev.h |   7 +
 3 files changed, 184 insertions(+), 246 deletions(-)

-- 
2.31.0.208.g409f899ff0-goog



[PATCH 1/5] crypto: ccp: Detect and reject vmalloc addresses destined for PSP

2021-04-02 Thread Sean Christopherson
Explicitly reject vmalloc'd data as the source for SEV commands that are
sent to the PSP.  The PSP works with physical addresses, and __pa() will
not return the correct address for a vmalloc'd pionter, which at best
will cause the command to fail, and at worst lead to system instability.

While it's unlikely that callers will deliberately use vmalloc() for SEV
buffers, a caller can easily use a vmalloc'd pointer unknowingly when
running with CONFIG_VMAP_STACK=y as it's not obvious that putting the
command buffers on the stack would be bad.  The command buffers are
relative small and easily fit on the stack, and the APIs to do not
document that the incoming pointer must be a physically contiguous,
__pa() friendly pointer.

Cc: Brijesh Singh 
Cc: Borislav Petkov 
Cc: Tom Lendacky 
Fixes: 200664d5237f ("crypto: ccp: Add Secure Encrypted Virtualization (SEV) 
command support")
Signed-off-by: Sean Christopherson 
---
 drivers/crypto/ccp/sev-dev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index cb9b4c4e371e..6556d220713b 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -150,6 +150,9 @@ static int __sev_do_cmd_locked(int cmd, void *data, int 
*psp_ret)
 
sev = psp->sev_data;
 
+   if (data && WARN_ON_ONCE(is_vmalloc_addr(data)))
+   return -EINVAL;
+
/* Get the physical address of the command buffer */
phys_lsb = data ? lower_32_bits(__psp_pa(data)) : 0;
phys_msb = data ? upper_32_bits(__psp_pa(data)) : 0;
-- 
2.31.0.208.g409f899ff0-goog



test

2021-04-02 Thread Abbas Maheryar
this is a test


Re: CFI violation in drivers/infiniband/core/sysfs.c

2021-04-02 Thread Jason Gunthorpe
On Fri, Apr 02, 2021 at 04:03:30PM -0700, Kees Cook wrote:

> > relevant. It seems to me that the hw_counters 'struct attribute_group'
> > should probably be its own kobj within both of these structures so they
> > can have their own sysfs ops (unless there is some other way to do this
> > that I am missing).

Err, yes, every subclass of the attribute should be accompanied by a
distinct kobject type to relay the show methods with typesafety, this
is how this design pattern is intended to be used.

If I understand your report properly the hw_stats_attribute is being
assigned to a 'port_type' kobject and it only works by pure luck because
the show/store happens to overlap between port and hsa attributes?

> > I would appreciate someone else taking a look and seeing if I am off
> > base or if there is an easier way to solve this.
> 
> So, it seems that the reason for a custom kobj_type here is to use the
> .release callback. 

Every kobject should be associated with a specific, fixed, attribute
type. The purpose of the wrappers is to inject type safety so the
attribute implementations know they are working on the right stuff.

In turn, an attribute of a specific attribute type can only be
injected into a kobject that has the matching type.

This code is insane because it does this:

hsag->attrs[i] = alloc_hsa(i, port_num, stats->names[i]);

// This is port kobject
struct kobject *kobj = >kobj;
ret = sysfs_create_group(kobj, hsag); 
[..]
// This is a struct device kobject
struct kobject *kobj = >dev.kobj;
ret = sysfs_create_group(kobj, hsag); 

Which is absolutely not allowed, you can't have a generic attribute
handler and stuff it into two different types! Things MUST use the
proper attribute subclass for what they are being attached to.

The answer is that the setup_hw_stats_() for port and device must
be split up and the attribute implementations for each of them have to
subclass starting from the correct type.

So we'd end up with two attributes for hw_stats

struct port_hw_stats {
struct port_attr attr;
struct hw_stats_data data;
};


struct device_hw_stats {
struct device_attr attr;
struct hw_stats_data data;
};

And then two show/set functions that bounce through the correct types
to the data.

And so on.

Jason


[tip:x86/core] BUILD SUCCESS b1f480bc0686e65d5413c035bd13af2ea4888784

2021-04-02 Thread kernel test robot
ia64 bigsur_defconfig
xtensa   common_defconfig
mips   gcw0_defconfig
mipsomega2p_defconfig
arm palmz72_defconfig
arm  pxa168_defconfig
sh   se7780_defconfig
mips tb0219_defconfig
mipsmalta_qemu_32r6_defconfig
powerpc mpc836x_mds_defconfig
sh   se7750_defconfig
arm lpc18xx_defconfig
arm  moxart_defconfig
s390 allyesconfig
sh  urquell_defconfig
sh sh03_defconfig
sh   se7712_defconfig
ia64defconfig
ia64 allyesconfig
m68k allmodconfig
m68kdefconfig
m68k allyesconfig
nios2   defconfig
arc  allyesconfig
nds32 allnoconfig
nds32   defconfig
nios2allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
arc defconfig
sh   allmodconfig
parisc  defconfig
s390 allmodconfig
parisc   allyesconfig
i386defconfig
sparcallyesconfig
sparc   defconfig
mips allyesconfig
mips allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc   allnoconfig
x86_64   randconfig-a004-20210402
x86_64   randconfig-a005-20210402
x86_64   randconfig-a003-20210402
x86_64   randconfig-a001-20210402
x86_64   randconfig-a002-20210402
x86_64   randconfig-a006-20210402
i386 randconfig-a006-20210402
i386 randconfig-a003-20210402
i386 randconfig-a001-20210402
i386 randconfig-a004-20210402
i386 randconfig-a005-20210402
i386 randconfig-a002-20210402
i386 randconfig-a006-20210401
i386 randconfig-a003-20210401
i386 randconfig-a001-20210401
i386 randconfig-a004-20210401
i386 randconfig-a002-20210401
i386 randconfig-a005-20210401
i386 randconfig-a014-20210402
i386 randconfig-a016-20210402
i386 randconfig-a011-20210402
i386 randconfig-a012-20210402
i386 randconfig-a013-20210402
i386 randconfig-a015-20210402
i386 randconfig-a014-20210401
i386 randconfig-a011-20210401
i386 randconfig-a016-20210401
i386 randconfig-a012-20210401
i386 randconfig-a013-20210401
i386 randconfig-a015-20210401
riscvnommu_virt_defconfig
riscv allnoconfig
riscv   defconfig
riscv  rv32_defconfig
um   allmodconfig
um   allyesconfig
um  defconfig
x86_64rhel-8.3-kselftests
x86_64  defconfig
x86_64   rhel-8.3
x86_64  rhel-8.3-kbuiltin
x86_64  kexec

clang tested configs:
x86_64   randconfig-a004-20210401
x86_64   randconfig-a005-20210401
x86_64   randconfig-a003-20210401
x86_64   randconfig-a001-20210401
x86_64   randconfig-a002-20210401
x86_64   randconfig-a006-20210401
x86_64   randconfig-a014-20210402
x86_64   randconfig-a015-20210402
x86_64   randconfig-a011-20210402
x86_64   randconfig-a013-20210402
x86_64   randconfig-a012-20210402
x86_64   randconfig-a016-20210402

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[tip:WIP.x86/core] BUILD SUCCESS 9bc0bb50727c8ac69fbb33fb937431cf3518ff37

2021-04-02 Thread kernel test robot
 tb0219_defconfig
mipsmalta_qemu_32r6_defconfig
powerpc mpc836x_mds_defconfig
sh   se7750_defconfig
riscvnommu_k210_defconfig
arm lpc18xx_defconfig
arm  moxart_defconfig
s390 allyesconfig
sh  urquell_defconfig
sh sh03_defconfig
sh   se7712_defconfig
ia64 allmodconfig
ia64defconfig
ia64 allyesconfig
m68k allmodconfig
m68kdefconfig
m68k allyesconfig
nios2   defconfig
arc  allyesconfig
nds32 allnoconfig
nds32   defconfig
nios2allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
arc defconfig
sh   allmodconfig
parisc  defconfig
s390 allmodconfig
parisc   allyesconfig
sparcallyesconfig
sparc   defconfig
i386defconfig
mips allyesconfig
mips allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc   allnoconfig
x86_64   randconfig-a004-20210402
x86_64   randconfig-a005-20210402
x86_64   randconfig-a003-20210402
x86_64   randconfig-a001-20210402
x86_64   randconfig-a002-20210402
x86_64   randconfig-a006-20210402
i386 randconfig-a006-20210402
i386 randconfig-a003-20210402
i386 randconfig-a001-20210402
i386 randconfig-a004-20210402
i386 randconfig-a005-20210402
i386 randconfig-a002-20210402
i386 randconfig-a006-20210401
i386 randconfig-a003-20210401
i386 randconfig-a001-20210401
i386 randconfig-a004-20210401
i386 randconfig-a002-20210401
i386 randconfig-a005-20210401
i386 randconfig-a014-20210402
i386 randconfig-a016-20210402
i386 randconfig-a011-20210402
i386 randconfig-a012-20210402
i386 randconfig-a013-20210402
i386 randconfig-a015-20210402
i386 randconfig-a014-20210401
i386 randconfig-a011-20210401
i386 randconfig-a016-20210401
i386 randconfig-a012-20210401
i386 randconfig-a013-20210401
i386 randconfig-a015-20210401
riscvnommu_virt_defconfig
riscv allnoconfig
riscv   defconfig
riscv  rv32_defconfig
um   allmodconfig
um   allyesconfig
um  defconfig
x86_64rhel-8.3-kselftests
x86_64  defconfig
x86_64   rhel-8.3
x86_64  rhel-8.3-kbuiltin
x86_64  kexec

clang tested configs:
x86_64   randconfig-a004-20210401
x86_64   randconfig-a005-20210401
x86_64   randconfig-a003-20210401
x86_64   randconfig-a001-20210401
x86_64   randconfig-a002-20210401
x86_64   randconfig-a006-20210401
x86_64   randconfig-a014-20210402
x86_64   randconfig-a015-20210402
x86_64   randconfig-a011-20210402
x86_64   randconfig-a013-20210402
x86_64   randconfig-a012-20210402
x86_64   randconfig-a016-20210402

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[tip:master] BUILD SUCCESS 839b9f0ebe3046b258601e1785ac8c4483266f92

2021-04-02 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
branch HEAD: 839b9f0ebe3046b258601e1785ac8c4483266f92  Merge branch 
'WIP.x86/core'

elapsed time: 722m

configs tested: 123
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm defconfig
arm64allyesconfig
arm64   defconfig
arm  allyesconfig
arm  allmodconfig
x86_64   allyesconfig
riscvallmodconfig
i386 allyesconfig
riscvallyesconfig
alphaalldefconfig
arm   sama5_defconfig
sh apsh4a3a_defconfig
arm   omap1_defconfig
arm   h5000_defconfig
powerpc redwood_defconfig
pariscgeneric-32bit_defconfig
mipsgpr_defconfig
powerpc tqm8540_defconfig
sh magicpanelr2_defconfig
m68k  hp300_defconfig
powerpc mpc8540_ads_defconfig
m68kmvme16x_defconfig
arcnsim_700_defconfig
arm lpc32xx_defconfig
powerpc tqm8560_defconfig
h8300   h8s-sim_defconfig
umallnoconfig
arm  badge4_defconfig
mips  loongson3_defconfig
arc nsimosci_hs_defconfig
powerpc  mpc885_ads_defconfig
arc  axs103_smp_defconfig
powerpc mpc832x_mds_defconfig
shtitan_defconfig
mips   ip32_defconfig
arm   corgi_defconfig
m68k   m5208evb_defconfig
mips   capcella_defconfig
arm  exynos_defconfig
powerpc mpc837x_rdb_defconfig
sh   se7750_defconfig
riscvnommu_k210_defconfig
arm lpc18xx_defconfig
arm  moxart_defconfig
s390 allyesconfig
ia64 allmodconfig
ia64defconfig
ia64 allyesconfig
m68k allmodconfig
m68kdefconfig
m68k allyesconfig
nds32   defconfig
nios2allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
arc defconfig
sh   allmodconfig
parisc  defconfig
s390 allmodconfig
parisc   allyesconfig
s390defconfig
sparcallyesconfig
sparc   defconfig
i386defconfig
nios2   defconfig
arc  allyesconfig
nds32 allnoconfig
mips allyesconfig
mips allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc   allnoconfig
x86_64   randconfig-a004-20210402
x86_64   randconfig-a005-20210402
x86_64   randconfig-a003-20210402
x86_64   randconfig-a001-20210402
x86_64   randconfig-a002-20210402
x86_64   randconfig-a006-20210402
i386 randconfig-a006-20210402
i386 randconfig-a003-20210402
i386 randconfig-a001-20210402
i386 randconfig-a004-20210402
i386 randconfig-a005-20210402
i386 randconfig-a002-20210402
i386 randconfig-a014-20210402
i386 randconfig-a016-20210402
i386 randconfig-a011-20210402
i386 randconfig-a012-20210402
i386 randconfig-a013-20210402
i386 randconfig-a015-20210402
i386 randconfig-a014-20210401
i386 randconfig-a011-20210401
i386 randconfig-a016-20210401
i386 randconfig-a012-20210401
i386 randconfig-a013-20210401
i386 randconfig-a015-20210401
riscvnommu_virt_defconfig
riscv allnoconfig
riscv

Re: CFI violation in drivers/infiniband/core/sysfs.c

2021-04-02 Thread Kees Cook
On Fri, Apr 02, 2021 at 12:52:41PM -0700, Nathan Chancellor wrote:
> Hi all,
> 
> I am testing the Clang Control Flow Integrity series that is being
> worked on right now [1] and I encounter a violation in the Infiniband
> sysfs core (drivers/infiniband/core/sysfs.c) on an arm64 server with mlx5:
> 
> $ cat /sys/class/infiniband/mlx5_bond_0/ports/1/hw_counters/lifespan
> 12
> 
> $ echo "10" | sudo tee 
> /sys/class/infiniband/mlx5_bond_0/ports/1/hw_counters/lifespan
> 10
> 
> $ sudo dmesg
> [64198.670342] [ cut here ]
> [64198.670362] CFI failure (target: show_stats_lifespan+0x0/0x8 [ib_core]):
> [64198.671291] WARNING: CPU: 20 PID: 15786 at kernel/cfi.c:29 
> __ubsan_handle_cfi_check_fail+0x58/0x60
> [64198.671336] Modules linked in: binfmt_misc nls_iso8859_1 dm_multipath
> scsi_dh_rdac scsi_dh_emc scsi_dh_alua ast drm_vram_helper drm_ttm_helper
> ttm aes_ce_blk crypto_simd drm_kms_helper cryptd cec rc_core
> aes_ce_cipher crct10dif_ce sysimgblt ghash_ce syscopyarea sysfillrect
> acpi_ipmi sha2_ce fb_sys_fops ipmi_ssif sha256_arm64 ipmi_devintf
> sha1_ce drm efi_pstore sbsa_gwdt tcp_westwood evbug ipmi_msghandler
> cppc_cpufreq xgene_hwmon ib_iser rdma_cm iw_cm ib_cm iscsi_tcp
> libiscsi_tcp libiscsi scsi_transport_iscsi bonding ip_tables x_tables
> autofs4 raid10 raid456 libcrc32c async_raid6_recov async_pq raid6_pq
> async_xor xor xor_neon async_memcpy async_tx raid1 raid0 multipath
> linear mlx5_ib ib_uverbs ib_core mlx5_core mlxfw igb i2c_algo_bit tls
> i2c_xgene_slimpro ahci_platform gpio_dwapb
> [64198.671958] CPU: 20 PID: 15786 Comm: cat Tainted: GW 
> 5.12.0-rc5+ #5
> [64198.671980] Hardware name: Lenovo HR330A7X33CTO1WW/HR350A  
>, BIOS HVE104D-1.02 03/08/2019
> [64198.671993] pstate: 6045 (nZCv daif +PAN -UAO -TCO BTYPE=--)
> [64198.672016] pc : __ubsan_handle_cfi_check_fail+0x58/0x60
> [64198.672036] lr : __ubsan_handle_cfi_check_fail+0x58/0x60
> [64198.672054] sp : 800014ea3b50
> [64198.672065] x29: 800014ea3b50 x28: 80001122da60 
> [64198.672095] x27: 80001122ad80 x26: 800011233088 
> [64198.672122] x25: 000801821a78 x24: 000820cda200 
> [64198.672148] x23: 89aa9f60 x22: 89a66000 
> [64198.672173] x21: 7dac81f92e1d85cf x20: 89abddd0 
> [64198.672198] x19: 89a69fd8 x18:  
> [64198.672223] x17:  x16:  
> [64198.672250] x15: 0004 x14: 1fff 
> [64198.672277] x13: 8000121412a8 x12: 0003 
> [64198.672303] x11:  x10: 0027 
> [64198.672329] x9 : 4568e3af67e9f000 x8 : 4568e3af67e9f000 
> [64198.672356] x7 : 6e6170736566696c x6 : 8000124699c9 
> [64198.672381] x5 :  x4 : 0001 
> [64198.672406] x3 :  x2 :  
> [64198.672431] x1 : 8000119b905d x0 : 003c 
> [64198.672457] Call trace:
> [64198.672469]  __ubsan_handle_cfi_check_fail+0x58/0x60
> [64198.672489]  __cfi_check_fail+0x3c/0x44 [ib_core]
> [64198.673362]  __cfi_check+0x2e78/0x31b0 [ib_core]
> [64198.674230]  port_attr_show+0x88/0x98 [ib_core]
> [64198.675098]  sysfs_kf_seq_show+0xc4/0x160
> [64198.675131]  kernfs_seq_show+0x5c/0xa4
> [64198.675157]  seq_read_iter+0x178/0x60c
> [64198.675176]  kernfs_fop_read_iter+0x78/0x1fc
> [64198.675202]  vfs_read+0x2d0/0x34c
> [64198.675220]  ksys_read+0x80/0xec
> [64198.675237]  __arm64_sys_read+0x28/0x34
> [64198.675253]  el0_svc_common.llvm.13467398108545334879+0xbc/0x1f0
> [64198.675277]  do_el0_svc+0x30/0xa4
> [64198.675293]  el0_svc+0x30/0xb0
> [64198.675314]  el0_sync_handler+0x84/0xe4
> [64198.675333]  el0_sync+0x174/0x180
> [64198.675351] ---[ end trace a253e31759778f5c ]---
> [64216.024673] [ cut here ]
> [64216.024678] CFI failure (target: set_stats_lifespan+0x0/0x8 [ib_core]):
> [64216.024824] WARNING: CPU: 3 PID: 15816 at kernel/cfi.c:29 
> __ubsan_handle_cfi_check_fail+0x58/0x60
> [64216.024832] Modules linked in: binfmt_misc nls_iso8859_1 dm_multipath
> scsi_dh_rdac scsi_dh_emc scsi_dh_alua ast drm_vram_helper drm_ttm_helper
> ttm aes_ce_blk crypto_simd drm_kms_helper cryptd cec rc_core
> aes_ce_cipher crct10dif_ce sysimgblt ghash_ce syscopyarea sysfillrect
> acpi_ipmi sha2_ce fb_sys_fops ipmi_ssif sha256_arm64 ipmi_devintf
> sha1_ce drm efi_pstore sbsa_gwdt tcp_westwood evbug ipmi_msghandler
> cppc_cpufreq xgene_hwmon ib_iser rdma_cm iw_cm ib_cm iscsi_tcp
> libiscsi_tcp libiscsi scsi_transport_iscsi bonding ip_tables x_tables
> autofs4 raid10 raid456 libcrc32c async_raid6_recov async_pq raid6_pq
> async_xor xor xor_neon async_memcpy async_tx raid1 raid0 multipath
> linear mlx5_ib ib_uverbs ib_core mlx5_core mlxfw igb i2c_algo_bit tls
> i2c_xgene_slimpro ahci_platform gpio_dwapb
> [64216.024922] CPU: 3 PID: 15816 Comm: tee Tainted: GW 
> 5.12.0-rc5+ #5
> [64216.024925] Hardware name: Lenovo HR330A7X33CTO1WW/HR350A  
>, BIOS 

[PATCH v7 clocksource 5/5] clocksource: Do pairwise clock-desynchronization checking

2021-04-02 Thread paulmck
From: "Paul E. McKenney" 

Although smp_call_function() has the advantage of simplicity, using
it to check for cross-CPU clock desynchronization means that any CPU
being slow reduces the sensitivity of the checking across all CPUs.
And it is not uncommon for smp_call_function() latencies to be in the
hundreds of microseconds.

This commit therefore switches to smp_call_function_single(), so that
delays from a given CPU affect only those measurements involving that
particular CPU.

Cc: John Stultz 
Cc: Thomas Gleixner 
Cc: Stephen Boyd 
Cc: Jonathan Corbet 
Cc: Mark Rutland 
Cc: Marc Zyngier 
Cc: Andi Kleen 
Reported-by: Chris Mason 
Signed-off-by: Paul E. McKenney 
---
 kernel/time/clocksource.c | 41 +
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index df48416..4161c84 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -214,7 +214,7 @@ static void clocksource_watchdog_inject_delay(void)
 }
 
 static struct clocksource *clocksource_verify_work_cs;
-static DEFINE_PER_CPU(u64, csnow_mid);
+static u64 csnow_mid;
 static cpumask_t cpus_ahead;
 static cpumask_t cpus_behind;
 
@@ -228,7 +228,7 @@ static void clocksource_verify_one_cpu(void *csin)
sign = ((smp_processor_id() >> inject_delay_shift_percpu) & 
0x1) * 2 - 1;
delta = sign * NSEC_PER_SEC;
}
-   __this_cpu_write(csnow_mid, cs->read(cs) + delta);
+   csnow_mid = cs->read(cs) + delta;
 }
 
 static void clocksource_verify_percpu_wq(struct work_struct *unused)
@@ -236,9 +236,12 @@ static void clocksource_verify_percpu_wq(struct 
work_struct *unused)
int cpu;
struct clocksource *cs;
int64_t cs_nsec;
+   int64_t cs_nsec_max;
+   int64_t cs_nsec_min;
u64 csnow_begin;
u64 csnow_end;
-   u64 delta;
+   s64 delta;
+   bool firsttime = 1;
 
cs = smp_load_acquire(_verify_work_cs); // pairs with 
release
if (WARN_ON_ONCE(!cs))
@@ -247,19 +250,28 @@ static void clocksource_verify_percpu_wq(struct 
work_struct *unused)
cs->name, smp_processor_id());
cpumask_clear(_ahead);
cpumask_clear(_behind);
-   csnow_begin = cs->read(cs);
-   smp_call_function(clocksource_verify_one_cpu, cs, 1);
-   csnow_end = cs->read(cs);
+   preempt_disable();
for_each_online_cpu(cpu) {
if (cpu == smp_processor_id())
continue;
-   delta = (per_cpu(csnow_mid, cpu) - csnow_begin) & cs->mask;
-   if ((s64)delta < 0)
+   csnow_begin = cs->read(cs);
+   smp_call_function_single(cpu, clocksource_verify_one_cpu, cs, 
1);
+   csnow_end = cs->read(cs);
+   delta = (s64)((csnow_mid - csnow_begin) & cs->mask);
+   if (delta < 0)
cpumask_set_cpu(cpu, _behind);
-   delta = (csnow_end - per_cpu(csnow_mid, cpu)) & cs->mask;
-   if ((s64)delta < 0)
+   delta = (csnow_end - csnow_mid) & cs->mask;
+   if (delta < 0)
cpumask_set_cpu(cpu, _ahead);
+   delta = clocksource_delta(csnow_end, csnow_begin, cs->mask);
+   cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
+   if (firsttime || cs_nsec > cs_nsec_max)
+   cs_nsec_max = cs_nsec;
+   if (firsttime || cs_nsec < cs_nsec_min)
+   cs_nsec_min = cs_nsec;
+   firsttime = 0;
}
+   preempt_enable();
if (!cpumask_empty(_ahead))
pr_warn("CPUs %*pbl ahead of CPU %d for clocksource 
%s.\n",
cpumask_pr_args(_ahead),
@@ -268,12 +280,9 @@ static void clocksource_verify_percpu_wq(struct 
work_struct *unused)
pr_warn("CPUs %*pbl behind CPU %d for clocksource 
%s.\n",
cpumask_pr_args(_behind),
smp_processor_id(), cs->name);
-   if (!cpumask_empty(_ahead) || !cpumask_empty(_behind)) {
-   delta = clocksource_delta(csnow_end, csnow_begin, cs->mask);
-   cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
-   pr_warn("CPU %d duration %lldns for clocksource %s.\n",
-   smp_processor_id(), cs_nsec, cs->name);
-   }
+   if (!firsttime && (!cpumask_empty(_ahead) || 
!cpumask_empty(_behind)))
+   pr_warn("CPU %d check durations %lldns - %lldns for 
clocksource %s.\n",
+   smp_processor_id(), cs_nsec_min, cs_nsec_max, cs->name);
smp_store_release(_verify_work_cs, NULL); // pairs with 
acquire.
 }
 
-- 
2.9.5



[PATCH v7 clocksource 4/5] clocksource: Provide a module parameter to fuzz per-CPU clock checking

2021-04-02 Thread paulmck
From: "Paul E. McKenney" 

Code that checks for clock desynchronization must itself be tested, so
this commit creates a new clocksource.inject_delay_shift_percpu= kernel
boot parameter that adds or subtracts a large value from the check read,
using the specified bit of the CPU ID to determine whether to add or
to subtract.

Cc: John Stultz 
Cc: Thomas Gleixner 
Cc: Stephen Boyd 
Cc: Jonathan Corbet 
Cc: Mark Rutland 
Cc: Marc Zyngier 
Cc: Andi Kleen 
Reported-by: Chris Mason 
[ paulmck: Apply Randy Dunlap feedback. ]
Signed-off-by: Paul E. McKenney 
---
 Documentation/admin-guide/kernel-parameters.txt | 16 
 kernel/time/clocksource.c   | 10 +-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index fc57952..f9da90f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -599,6 +599,22 @@
times the value specified for inject_delay_freq
of consecutive non-delays.
 
+   clocksource.inject_delay_shift_percpu= [KNL]
+   Clocksource delay injection partitions the CPUs
+   into two sets, one whose clocks are moved ahead
+   and the other whose clocks are moved behind.
+   This kernel parameter selects the CPU-number
+   bit that determines which of these two sets the
+   corresponding CPU is placed into.  For example,
+   setting this parameter to the value 4 will result
+   in the first set containing alternating groups
+   of 16 CPUs whose clocks are moved ahead, while
+   the second set will contain the rest of the CPUs,
+   whose clocks are moved behind.
+
+   The default value of -1 disables this type of
+   error injection.
+
clocksource.max_read_retries= [KNL]
Number of clocksource_watchdog() retries due to
external delays before the clock will be marked
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 663bc53..df48416 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -190,6 +190,8 @@ static int inject_delay_freq;
 module_param(inject_delay_freq, int, 0644);
 static int inject_delay_run = 1;
 module_param(inject_delay_run, int, 0644);
+static int inject_delay_shift_percpu = -1;
+module_param(inject_delay_shift_percpu, int, 0644);
 static int max_read_retries = 3;
 module_param(max_read_retries, int, 0644);
 
@@ -219,8 +221,14 @@ static cpumask_t cpus_behind;
 static void clocksource_verify_one_cpu(void *csin)
 {
struct clocksource *cs = (struct clocksource *)csin;
+   s64 delta = 0;
+   int sign;
 
-   __this_cpu_write(csnow_mid, cs->read(cs));
+   if (inject_delay_shift_percpu >= 0) {
+   sign = ((smp_processor_id() >> inject_delay_shift_percpu) & 
0x1) * 2 - 1;
+   delta = sign * NSEC_PER_SEC;
+   }
+   __this_cpu_write(csnow_mid, cs->read(cs) + delta);
 }
 
 static void clocksource_verify_percpu_wq(struct work_struct *unused)
-- 
2.9.5



[PATCH v7 clocksource 3/5] clocksource: Check per-CPU clock synchronization when marked unstable

2021-04-02 Thread paulmck
From: "Paul E. McKenney" 

Some sorts of per-CPU clock sources have a history of going out of
synchronization with each other.  However, this problem has purportedy
been solved in the past ten years.  Except that it is all too possible
that the problem has instead simply been made less likely, which might
mean that some of the occasional "Marking clocksource 'tsc' as unstable"
messages might be due to desynchronization.  How would anyone know?

This commit therefore adds CPU-to-CPU synchronization checking
for newly unstable clocksource that are marked with the new
CLOCK_SOURCE_VERIFY_PERCPU flag.  Lists of desynchronized CPUs are
printed, with the caveat that if it is the reporting CPU that is itself
desynchronized, it will appear that all the other clocks are wrong.
Just like in real life.

Cc: John Stultz 
Cc: Thomas Gleixner 
Cc: Stephen Boyd 
Cc: Jonathan Corbet 
Cc: Mark Rutland 
Cc: Marc Zyngier 
Cc: Andi Kleen 
Reported-by: Chris Mason 
[ paulmck: Add "static" to clocksource_verify_one_cpu() per kernel test robot 
feedback. ]
Signed-off-by: Paul E. McKenney 
---
 arch/x86/kernel/kvmclock.c  |  2 +-
 arch/x86/kernel/tsc.c   |  3 +-
 include/linux/clocksource.h |  2 +-
 kernel/time/clocksource.c   | 73 +
 4 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 1fc0962..97eeaf1 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -169,7 +169,7 @@ struct clocksource kvm_clock = {
.read   = kvm_clock_get_cycles,
.rating = 400,
.mask   = CLOCKSOURCE_MASK(64),
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
+   .flags  = CLOCK_SOURCE_IS_CONTINUOUS | CLOCK_SOURCE_VERIFY_PERCPU,
.enable = kvm_cs_enable,
 };
 EXPORT_SYMBOL_GPL(kvm_clock);
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index f70dffc..5628917 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1151,7 +1151,8 @@ static struct clocksource clocksource_tsc = {
.mask   = CLOCKSOURCE_MASK(64),
.flags  = CLOCK_SOURCE_IS_CONTINUOUS |
  CLOCK_SOURCE_VALID_FOR_HRES |
- CLOCK_SOURCE_MUST_VERIFY,
+ CLOCK_SOURCE_MUST_VERIFY |
+ CLOCK_SOURCE_VERIFY_PERCPU,
.vdso_clock_mode= VDSO_CLOCKMODE_TSC,
.enable = tsc_cs_enable,
.resume = tsc_resume,
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 86d143d..83a3ebf 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -131,7 +131,7 @@ struct clocksource {
 #define CLOCK_SOURCE_UNSTABLE  0x40
 #define CLOCK_SOURCE_SUSPEND_NONSTOP   0x80
 #define CLOCK_SOURCE_RESELECT  0x100
-
+#define CLOCK_SOURCE_VERIFY_PERCPU 0x200
 /* simplify initialization of mask field */
 #define CLOCKSOURCE_MASK(bits) GENMASK_ULL((bits) - 1, 0)
 
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 3f734c6..663bc53 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -211,6 +211,78 @@ static void clocksource_watchdog_inject_delay(void)
WARN_ON_ONCE(injectfail < 0);
 }
 
+static struct clocksource *clocksource_verify_work_cs;
+static DEFINE_PER_CPU(u64, csnow_mid);
+static cpumask_t cpus_ahead;
+static cpumask_t cpus_behind;
+
+static void clocksource_verify_one_cpu(void *csin)
+{
+   struct clocksource *cs = (struct clocksource *)csin;
+
+   __this_cpu_write(csnow_mid, cs->read(cs));
+}
+
+static void clocksource_verify_percpu_wq(struct work_struct *unused)
+{
+   int cpu;
+   struct clocksource *cs;
+   int64_t cs_nsec;
+   u64 csnow_begin;
+   u64 csnow_end;
+   u64 delta;
+
+   cs = smp_load_acquire(_verify_work_cs); // pairs with 
release
+   if (WARN_ON_ONCE(!cs))
+   return;
+   pr_warn("Checking clocksource %s synchronization from CPU %d.\n",
+   cs->name, smp_processor_id());
+   cpumask_clear(_ahead);
+   cpumask_clear(_behind);
+   csnow_begin = cs->read(cs);
+   smp_call_function(clocksource_verify_one_cpu, cs, 1);
+   csnow_end = cs->read(cs);
+   for_each_online_cpu(cpu) {
+   if (cpu == smp_processor_id())
+   continue;
+   delta = (per_cpu(csnow_mid, cpu) - csnow_begin) & cs->mask;
+   if ((s64)delta < 0)
+   cpumask_set_cpu(cpu, _behind);
+   delta = (csnow_end - per_cpu(csnow_mid, cpu)) & cs->mask;
+   if ((s64)delta < 0)
+   cpumask_set_cpu(cpu, _ahead);
+   }
+   if (!cpumask_empty(_ahead))
+   pr_warn("CPUs %*pbl ahead of CPU %d for clocksource 
%s.\n",
+   cpumask_pr_args(_ahead),
+   

[PATCH v7 clocksource 2/5] clocksource: Retry clock read if long delays detected

2021-04-02 Thread paulmck
From: "Paul E. McKenney" 

When the clocksource watchdog marks a clock as unstable, this might
be due to that clock being unstable or it might be due to delays that
happen to occur between the reads of the two clocks.  Yes, interrupts are
disabled across those two reads, but there are no shortage of things that
can delay interrupts-disabled regions of code ranging from SMI handlers
to vCPU preemption.  It would be good to have some indication as to why
the clock was marked unstable.

This commit therefore re-reads the watchdog clock on either side of
the read from the clock under test.  If the watchdog clock shows an
excessive time delta between its pair of reads, the reads are retried.
The maximum number of retries is specified by a new kernel boot
parameter clocksource.max_read_retries, which defaults to three, that
is, up to four reads, one initial and up to three retries.  If retries
were required, a message is printed on the console.  If the number of
retries is exceeded, the clock under test will be marked unstable.
However, the probability of this happening due to various sorts of
delays is quite small.  In addition, the reason (clock-read delays)
for the unstable marking will be apparent.

Cc: John Stultz 
Cc: Thomas Gleixner 
Cc: Stephen Boyd 
Cc: Jonathan Corbet 
Cc: Mark Rutland 
Cc: Marc Zyngier 
Cc: Andi Kleen 
Reported-by: Chris Mason 
[ paulmck: Per-clocksource retries per Neeraj Upadhyay feedback. ]
[ paulmck: Don't reset injectfail per Neeraj Upadhyay feedback. ]
Signed-off-by: Paul E. McKenney 
---
 kernel/time/clocksource.c | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 4be4391..3f734c6 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -124,6 +124,7 @@ static void __clocksource_change_rating(struct clocksource 
*cs, int rating);
  */
 #define WATCHDOG_INTERVAL (HZ >> 1)
 #define WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 4)
+#define WATCHDOG_MAX_SKEW (NSEC_PER_SEC >> 6)
 
 static void clocksource_watchdog_work(struct work_struct *work)
 {
@@ -213,9 +214,10 @@ static void clocksource_watchdog_inject_delay(void)
 static void clocksource_watchdog(struct timer_list *unused)
 {
struct clocksource *cs;
-   u64 csnow, wdnow, cslast, wdlast, delta;
-   int64_t wd_nsec, cs_nsec;
+   u64 csnow, wdnow, wdagain, cslast, wdlast, delta;
+   int64_t wd_nsec, wdagain_nsec, wderr_nsec = 0, cs_nsec;
int next_cpu, reset_pending;
+   int nretries;
 
spin_lock(_lock);
if (!watchdog_running)
@@ -224,6 +226,7 @@ static void clocksource_watchdog(struct timer_list *unused)
reset_pending = atomic_read(_reset_pending);
 
list_for_each_entry(cs, _list, wd_list) {
+   nretries = 0;
 
/* Clocksource already marked unstable? */
if (cs->flags & CLOCK_SOURCE_UNSTABLE) {
@@ -232,11 +235,23 @@ static void clocksource_watchdog(struct timer_list 
*unused)
continue;
}
 
+retry:
local_irq_disable();
-   csnow = cs->read(cs);
-   clocksource_watchdog_inject_delay();
wdnow = watchdog->read(watchdog);
+   clocksource_watchdog_inject_delay();
+   csnow = cs->read(cs);
+   wdagain = watchdog->read(watchdog);
local_irq_enable();
+   delta = clocksource_delta(wdagain, wdnow, watchdog->mask);
+   wdagain_nsec = clocksource_cyc2ns(delta, watchdog->mult, 
watchdog->shift);
+   if (wdagain_nsec < 0 || wdagain_nsec > WATCHDOG_MAX_SKEW) {
+   wderr_nsec = wdagain_nsec;
+   if (nretries++ < max_read_retries)
+   goto retry;
+   }
+   if (nretries)
+   pr_warn("timekeeping watchdog on CPU%d: %s read-back 
delay of %lldns, attempt %d\n",
+   smp_processor_id(), watchdog->name, wderr_nsec, 
nretries);
 
/* Clocksource initialized ? */
if (!(cs->flags & CLOCK_SOURCE_WATCHDOG) ||
-- 
2.9.5



[PATCH v7 clocksource 1/5] clocksource: Provide module parameters to inject delays in watchdog

2021-04-02 Thread paulmck
From: "Paul E. McKenney" 

When the clocksource watchdog marks a clock as unstable, this might be due
to that clock being unstable or it might be due to delays that happen to
occur between the reads of the two clocks.  Yes, interrupts are disabled
across those two reads, but there are no shortage of things that can
delay interrupts-disabled regions of code ranging from SMI handlers to
vCPU preemption.  It would be good to have some indication as to why
the clock was marked unstable.

The first step is a way of injecting such delays, and this
commit therefore provides a clocksource.inject_delay_freq and
clocksource.inject_delay_run kernel boot parameters that specify that
sufficient delay be injected to cause the clocksource_watchdog()
function to mark a clock unstable.  This delay is injected every
Nth set of M calls to clocksource_watchdog(), where N is the value
specified for the inject_delay_freq boot parameter and M is the value
specified for the inject_delay_run boot parameter.  Values of zero or
less for either parameter disable delay injection, and the default for
clocksource.inject_delay_freq is zero, that is, disabled.  The default for
clocksource.inject_delay_run is the value one, that is single-call runs.

This facility is intended for diagnostic use only, and should be avoided
on production systems.

Cc: John Stultz 
Cc: Thomas Gleixner 
Cc: Stephen Boyd 
Cc: Jonathan Corbet 
Cc: Mark Rutland 
Cc: Marc Zyngier 
Cc: Andi Kleen 
[ paulmck: Apply Rik van Riel feedback. ]
Reported-by: Chris Mason 
Signed-off-by: Paul E. McKenney 
---
 Documentation/admin-guide/kernel-parameters.txt | 22 
 kernel/time/clocksource.c   | 27 +
 2 files changed, 49 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 0454572..fc57952 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -583,6 +583,28 @@
loops can be debugged more effectively on production
systems.
 
+   clocksource.inject_delay_freq= [KNL]
+   Number of runs of calls to clocksource_watchdog()
+   before delays are injected between reads from the
+   two clocksources.  Values less than or equal to
+   zero disable this delay injection.  These delays
+   can cause clocks to be marked unstable, so use
+   of this parameter should therefore be avoided on
+   production systems.  Defaults to zero (disabled).
+
+   clocksource.inject_delay_run= [KNL]
+   Run lengths of clocksource_watchdog() delay
+   injections.  Specifying the value 8 will result
+   in eight consecutive delays followed by eight
+   times the value specified for inject_delay_freq
+   of consecutive non-delays.
+
+   clocksource.max_read_retries= [KNL]
+   Number of clocksource_watchdog() retries due to
+   external delays before the clock will be marked
+   unstable.  Defaults to three retries, that is,
+   four attempts to read the clock under test.
+
clearcpuid=BITNUM[,BITNUM...] [X86]
Disable CPUID feature X for the kernel. See
arch/x86/include/asm/cpufeatures.h for the valid bit
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index cce484a..4be4391 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -14,6 +14,7 @@
 #include  /* for spin_unlock_irq() using preempt_count() m68k */
 #include 
 #include 
+#include 
 
 #include "tick-internal.h"
 #include "timekeeping_internal.h"
@@ -184,6 +185,31 @@ void clocksource_mark_unstable(struct clocksource *cs)
spin_unlock_irqrestore(_lock, flags);
 }
 
+static int inject_delay_freq;
+module_param(inject_delay_freq, int, 0644);
+static int inject_delay_run = 1;
+module_param(inject_delay_run, int, 0644);
+static int max_read_retries = 3;
+module_param(max_read_retries, int, 0644);
+
+static void clocksource_watchdog_inject_delay(void)
+{
+   int i;
+   static int injectfail = -1;
+
+   if (inject_delay_freq <= 0 || inject_delay_run <= 0)
+   return;
+   if (injectfail < 0 || injectfail > INT_MAX / 2)
+   injectfail = inject_delay_run;
+   if (!(++injectfail / inject_delay_run % inject_delay_freq)) {
+   pr_warn("%s(): Injecting delay.\n", __func__);
+   for (i = 0; i < 2 * WATCHDOG_THRESHOLD / NSEC_PER_MSEC; i++)
+   udelay(1000);
+   pr_warn("%s(): Done injecting delay.\n", __func__);
+   }
+   WARN_ON_ONCE(injectfail < 0);
+}
+
 static void 

[PATCH v7 clocksource] Do not mark clocks unstable due to delays for v5.13

2021-04-02 Thread Paul E. McKenney
Hello!

If there is a sufficient delay between reading the watchdog clock and the
clock under test, the clock under test will be marked unstable through no
fault of its own.  This series checks for this, doing limited retries
to get a good set of clock reads.  If the clock is marked unstable
and is marked as being per-CPU, cross-CPU synchronization is checked.
This series also provides delay injection, which may be enabled via
kernel boot parameters to test the checking for delays.

Note that "sufficient delay" can be provided by SMIs, NMIs, and of course
vCPU preemption.

1.  Provide module parameters to inject delays in watchdog.

2.  Retry clock read if long delays detected.

3.  Check per-CPU clock synchronization when marked unstable.

4.  Provide a module parameter to fuzz per-CPU clock checking.

5.  Do pairwise clock-desynchronization checking.

Changes since v7:

o   Fix embarrassing git-format-patch operator error.

Changes since v5:

o   Rebased to v5.12-rc5.

Changes since v4:

o   Rebased to v5.12-rc1.

Changes since v3:

o   Rebased to v5.11.

o   Apply Randy Dunlap feedback.

Changes since v2:

o   Rebased to v5.11-rc6.

o   Updated Cc: list.

Changes since v1:

o   Applied feedback from Rik van Riel.

o   Rebased to v5.11-rc3.

o   Stripped "RFC" from the subject lines.

Thanx, Paul



 Documentation/admin-guide/kernel-parameters.txt |   38 +
 arch/x86/kernel/kvmclock.c  |2 
 arch/x86/kernel/tsc.c   |3 
 include/linux/clocksource.h |2 
 kernel/time/clocksource.c   |  174 +---
 5 files changed, 195 insertions(+), 24 deletions(-)


Re: [GIT PULL] Power management fixes for v5.12-rc6

2021-04-02 Thread pr-tracker-bot
The pull request you sent on Fri, 2 Apr 2021 18:01:42 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git pm-5.12-rc6

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/9314a0e9c711b0c092158ee9e0ed24d5ea25c90a

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] tracing: Fix stack trace event size

2021-04-02 Thread pr-tracker-bot
The pull request you sent on Fri, 2 Apr 2021 09:33:15 -0400:

> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git 
> trace-v5.12-rc5-2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/05de45383bd134fcb2b7d70d35ebb0bb50b5e4aa

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [PATCH v6 clocksource] Do not mark clocks unstable dueclocksource: Provide module parameters to inject delays in watchdog

2021-04-02 Thread Paul E. McKenney
On Sat, Apr 03, 2021 at 12:22:40AM +0200, Thomas Gleixner wrote:
> On Fri, Apr 02 2021 at 13:31, paulmck wrote:
> 
> The subsystem prefix does not parse:
> 
> [PATCH v6 clocksource] Do not mark clocks unstable dueclocksource: 
> Provide module parameters to inject delays in watchdog

Where is that brown bag when I need it?  :-/

> I look at the actual code changes after the easter break.

I will resend a clean copy as a reply to your message.

Thanx, Paul


[PATCH v3 12/12] drm/panel: panel-simple: Use runtime pm to avoid excessive unprepare / prepare

2021-04-02 Thread Douglas Anderson
Unpreparing and re-preparing a panel can be a really heavy
operation. Panels datasheets often specify something on the order of
500ms as the delay you should insert after turning off the panel
before turning it on again. In addition, turning on a panel can have
delays on the order of 100ms - 200ms before the panel will assert HPD
(AKA "panel ready"). The above means that we should avoid turning a
panel off if we're going to turn it on again shortly.

The above becomes a problem when we want to read the EDID of a
panel. The way that ordering works is that userspace wants to read the
EDID of the panel _before_ fully enabling it so that it can set the
initial mode correctly. However, we can't read the EDID until we power
it up. This leads to code that does this dance (like
ps8640_bridge_get_edid()):

1. When userspace requests EDID / the panel modes (through an ioctl),
   we power on the panel just enough to read the EDID and then power
   it off.
2. Userspace then turns the panel on.

There's likely not much time between step #1 and #2 and so we want to
avoid powering the panel off and on again between those two steps.

Let's use Runtime PM to help us. We'll move the existing prepare() and
unprepare() to be runtime resume() and runtime suspend(). Now when we
want to prepare() or unprepare() we just increment or decrement the
refcount. We'll default to a 1 second autosuspend delay which seems
sane given the typical delays we see for panels.

A few notes:
- It seems the existing unprepare() and prepare() are defined to be
  no-ops if called extra times. We'll preserve that behavior.
- This is a slight change in the ABI of simple panel. If something was
  absolutely relying on the unprepare() to happen instantly that
  simply won't be the case anymore. I'm not aware of anyone relying on
  that behavior, but if there is someone then we'll need to figure out
  how to enable (or disable) this new delayed behavior selectively.
- In order for this to work we now have a hard dependency on
  "PM". From memory this is a legit thing to assume these days and we
  don't have to find some fallback to keep working if someone wants to
  build their system without "PM".

Signed-off-by: Douglas Anderson 
---

(no changes since v1)

 drivers/gpu/drm/panel/Kconfig|  1 +
 drivers/gpu/drm/panel/panel-simple.c | 93 +---
 2 files changed, 73 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig
index 4894913936e9..ef87d92cdf49 100644
--- a/drivers/gpu/drm/panel/Kconfig
+++ b/drivers/gpu/drm/panel/Kconfig
@@ -80,6 +80,7 @@ config DRM_PANEL_SIMPLE
tristate "support for simple panels"
depends on OF
depends on BACKLIGHT_CLASS_DEVICE
+   depends on PM
select VIDEOMODE_HELPERS
help
  DRM panel driver for dumb panels that need at most a regulator and
diff --git a/drivers/gpu/drm/panel/panel-simple.c 
b/drivers/gpu/drm/panel/panel-simple.c
index be312b5c04dd..6b22872b3281 100644
--- a/drivers/gpu/drm/panel/panel-simple.c
+++ b/drivers/gpu/drm/panel/panel-simple.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -175,6 +176,8 @@ struct panel_simple {
bool enabled;
bool no_hpd;
 
+   bool prepared;
+
ktime_t prepared_time;
ktime_t unprepared_time;
 
@@ -334,19 +337,31 @@ static int panel_simple_disable(struct drm_panel *panel)
return 0;
 }
 
+static int panel_simple_suspend(struct device *dev)
+{
+   struct panel_simple *p = dev_get_drvdata(dev);
+
+   gpiod_set_value_cansleep(p->enable_gpio, 0);
+   regulator_disable(p->supply);
+   p->unprepared_time = ktime_get();
+
+   return 0;
+}
+
 static int panel_simple_unprepare(struct drm_panel *panel)
 {
struct panel_simple *p = to_panel_simple(panel);
+   int ret;
 
-   if (p->prepared_time == 0)
+   /* Unpreparing when already unprepared is a no-op */
+   if (!p->prepared)
return 0;
 
-   gpiod_set_value_cansleep(p->enable_gpio, 0);
-
-   regulator_disable(p->supply);
-
-   p->prepared_time = 0;
-   p->unprepared_time = ktime_get();
+   pm_runtime_mark_last_busy(panel->dev);
+   ret = pm_runtime_put_autosuspend(panel->dev);
+   if (ret < 0)
+   return ret;
+   p->prepared = false;
 
return 0;
 }
@@ -376,22 +391,19 @@ static int panel_simple_get_hpd_gpio(struct device *dev,
return 0;
 }
 
-static int panel_simple_prepare_once(struct drm_panel *panel)
+static int panel_simple_prepare_once(struct panel_simple *p)
 {
-   struct panel_simple *p = to_panel_simple(panel);
+   struct device *dev = p->base.dev;
unsigned int delay;
int err;
int hpd_asserted;
unsigned long hpd_wait_us;
 
-   if (p->prepared_time != 0)
-   return 0;
-
panel_simple_wait(p->unprepared_time, p->desc->delay.unprepare);
 

[PATCH v3 11/12] drm/bridge: ti-sn65dsi86: Print an error if we fallback to panel modes

2021-04-02 Thread Douglas Anderson
Now that we can properly read the EDID for modes there should be no
reason to fallback to the fixed modes that our downstream panel driver
provides us. Let's make that clear by:
- Putting an error message in the logs if we fall back.
- Putting a comment in saying what's going on.

Signed-off-by: Douglas Anderson 
---

(no changes since v1)

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index fb50f9f95b0f..3b61898cf9cb 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -303,6 +303,13 @@ static int ti_sn_bridge_connector_get_modes(struct 
drm_connector *connector)
return num;
}
 
+   /*
+* Ideally this should never happen and we could remove the fallback
+* but let's preserve old behavior.
+*/
+   DRM_DEV_ERROR(pdata->dev,
+ "Failed to read EDID; falling back to panel modes");
+
 exit:
return drm_panel_get_modes(pdata->panel, connector);
 }
-- 
2.31.0.208.g409f899ff0-goog



[PATCH v3 10/12] drm/bridge: ti-sn65dsi86: Read the EDID only if refclk was provided

2021-04-02 Thread Douglas Anderson
Though I don't have access to any hardware that uses ti-sn65dsi86 and
_doesn't_ provide a "refclk", I believe that we'll have trouble
reading the EDID at bootup in that case. Specifically I believe that
if there's no "refclk" we need the MIPI source clock to be active
before we can successfully read the EDID. My evidence here is that, in
testing, I couldn't read the EDID until I turned on the DPPLL in the
bridge chip and that the DPPLL needs the input clock to be active.

Since this is hard to support, let's punt trying to read the EDID if
there's no "refclk".

I don't believe there are any users of the ti-sn65dsi86 bridge chip
that _don't_ use "refclk". The bridge chip is _very_ inflexible in
that mode. The only time I've seen that mode used was for some really
early prototype hardware that was thrown in the e-waste bin years ago
when we realized how inflexible it was.

Even if someone is using the bridge chip without the "refclk" they're
in no worse shape than they were before the (fairly recent) commit
58074b08c04a ("drm/bridge: ti-sn65dsi86: Read EDID blob over DDC").

Signed-off-by: Douglas Anderson 
---

(no changes since v1)

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index a76cac93cb46..fb50f9f95b0f 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -275,6 +275,18 @@ static int ti_sn_bridge_connector_get_modes(struct 
drm_connector *connector)
bool was_enabled;
int num;
 
+   /*
+* Don't try to read the EDID if no refclk. In theory it is possible
+* to make this work but it's tricky. I believe that we need to get
+* our upstream MIPI source to provide a pixel clock before we can
+* do AUX transations but we need to be able to read the EDID before
+* we've picked a display mode. The bridge is already super limited
+* if you try to use it without a refclk so presumably limiting to
+* the fixed modes our downstream panel reports is fine.
+*/
+   if (!pdata->refclk)
+   goto exit;
+
if (!edid) {
was_enabled = pdata->pre_enabled;
 
@@ -291,6 +303,7 @@ static int ti_sn_bridge_connector_get_modes(struct 
drm_connector *connector)
return num;
}
 
+exit:
return drm_panel_get_modes(pdata->panel, connector);
 }
 
-- 
2.31.0.208.g409f899ff0-goog



[PATCH v3 09/12] drm/bridge: ti-sn65dsi86: Fail aux transfers right away if not powered

2021-04-02 Thread Douglas Anderson
If the bridge (and panel) haven't been powered on then AUX transfers
just won't work. Let's just fail them instantly.

Signed-off-by: Douglas Anderson 
---
If the patch ("drm/panel: panel-simple: Use runtime pm to avoid
excessive unprepare / prepare") is accepted then we could consider
actually powering the panel on instead of failing the
transfer. However, without that patch the overhead would just be too
much since we need to do several AUX transfers for a single EDID read
and powering up and down each time would just be too much.

Changes in v3:
- ("Fail aux transfers right away if not powered") split out for v3.

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index 543590801a8e..a76cac93cb46 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -896,6 +896,13 @@ static ssize_t ti_sn_aux_transfer(struct drm_dp_aux *aux,
int ret;
u8 addr_len[SN_AUX_LENGTH_REG + 1 - SN_AUX_ADDR_19_16_REG];
 
+   /*
+* Things just won't work if the panel isn't powered. Return failure
+* right away.
+*/
+   if (!pdata->pre_enabled)
+   return -EIO;
+
if (len > SN_AUX_MAX_PAYLOAD_BYTES)
return -EINVAL;
 
-- 
2.31.0.208.g409f899ff0-goog



[PATCH v3 08/12] drm/bridge: ti-sn65dsi86: Power things properly for reading the EDID

2021-04-02 Thread Douglas Anderson
eDP panels won't provide their EDID unless they're powered on. Let's
chain a power-on before we read the EDID. This roughly matches what
was done in 'parade-ps8640.c'.

NOTE: The old code attempted to call pm_runtime_get_sync() before
reading the EDID. While that was enough to power the bridge chip on,
it wasn't enough to talk to the panel for two reasons:
1. Since we never ran the bridge chip's pre-enable then we never set
   the bit to ignore HPD. This meant the bridge chip didn't even _try_
   to go out on the bus and communicate with the panel.
2. Even if we fixed things to ignore HPD, the EDID still wouldn't read
   if the panel wasn't on.

ALSO NOTE: Without the future patch ("drm/panel: panel-simple: Use
runtime pm to avoid excessive unprepare / prepare") there will be boot
speed implications here. Specifically we'll power the panel on to read
the EDID, then fully off. Then we'll likely have to wait the minimum
time between power off and power on.

Fixes: 58074b08c04a ("drm/bridge: ti-sn65dsi86: Read EDID blob over DDC")
Signed-off-by: Douglas Anderson 
---

Changes in v3:
- Rebased now that we're not moving EDID caching to the core.
- Separating out patch to block AUX channel when not powered.
- Added note about boot speed implications.

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index 6390bc58f29a..543590801a8e 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -129,6 +129,7 @@
  * @dp_lanes: Count of dp_lanes we're using.
  * @ln_assign:Value to program to the LN_ASSIGN register.
  * @ln_polrs: Value for the 4-bit LN_POLRS field of SN_ENH_FRAME_REG.
+ * @pre_enabled:  If true then pre_enable() has run.
  *
  * @gchip:If we expose our GPIOs, this is used.
  * @gchip_output: A cache of whether we've set GPIOs to output.  This
@@ -157,6 +158,7 @@ struct ti_sn_bridge {
int dp_lanes;
u8  ln_assign;
u8  ln_polrs;
+   boolpre_enabled;
 
 #if defined(CONFIG_OF_GPIO)
struct gpio_chipgchip;
@@ -270,12 +272,17 @@ static int ti_sn_bridge_connector_get_modes(struct 
drm_connector *connector)
 {
struct ti_sn_bridge *pdata = connector_to_ti_sn_bridge(connector);
struct edid *edid = pdata->edid;
+   bool was_enabled;
int num;
 
if (!edid) {
-   pm_runtime_get_sync(pdata->dev);
+   was_enabled = pdata->pre_enabled;
+
+   if (!was_enabled)
+   drm_bridge_chain_pre_enable(>bridge);
edid = pdata->edid = drm_get_edid(connector, >aux.ddc);
-   pm_runtime_put(pdata->dev);
+   if (!was_enabled)
+   drm_bridge_chain_post_disable(>bridge);
}
 
if (edid && drm_edid_is_valid(edid)) {
@@ -846,12 +853,16 @@ static void ti_sn_bridge_pre_enable(struct drm_bridge 
*bridge)
   HPD_DISABLE);
 
drm_panel_prepare(pdata->panel);
+
+   pdata->pre_enabled = true;
 }
 
 static void ti_sn_bridge_post_disable(struct drm_bridge *bridge)
 {
struct ti_sn_bridge *pdata = bridge_to_ti_sn_bridge(bridge);
 
+   pdata->pre_enabled = false;
+
drm_panel_unprepare(pdata->panel);
 
clk_disable_unprepare(pdata->refclk);
-- 
2.31.0.208.g409f899ff0-goog



[PATCH v3 07/12] drm/bridge: ti-sn65dsi86: Remove extra call: drm_connector_update_edid_property()

2021-04-02 Thread Douglas Anderson
As of commit 5186421cbfe2 ("drm: Introduce epoch counter to
drm_connector") the drm_get_edid() function calls
drm_connector_update_edid_property() for us. There's no reason for us
to call it again.

Signed-off-by: Douglas Anderson 
Reviewed-by: Andrzej Hajda 
---

(no changes since v1)

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index 51db30d573c1..6390bc58f29a 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -270,7 +270,7 @@ static int ti_sn_bridge_connector_get_modes(struct 
drm_connector *connector)
 {
struct ti_sn_bridge *pdata = connector_to_ti_sn_bridge(connector);
struct edid *edid = pdata->edid;
-   int num, ret;
+   int num;
 
if (!edid) {
pm_runtime_get_sync(pdata->dev);
@@ -279,12 +279,9 @@ static int ti_sn_bridge_connector_get_modes(struct 
drm_connector *connector)
}
 
if (edid && drm_edid_is_valid(edid)) {
-   ret = drm_connector_update_edid_property(connector, edid);
-   if (!ret) {
-   num = drm_add_edid_modes(connector, edid);
-   if (num)
-   return num;
-   }
+   num = drm_add_edid_modes(connector, edid);
+   if (num)
+   return num;
}
 
return drm_panel_get_modes(pdata->panel, connector);
-- 
2.31.0.208.g409f899ff0-goog



[PATCH v3 06/12] drm/bridge: ti-sn65dsi86: Get rid of the useless detect() function

2021-04-02 Thread Douglas Anderson
If we just leave the detect() function as NULL then the upper layers
assume we're always connected. There's no reason for a stub.

Signed-off-by: Douglas Anderson 
Reviewed-by: Andrzej Hajda 
---

(no changes since v1)

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index e30460002c48..51db30d573c1 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -306,20 +306,8 @@ static struct drm_connector_helper_funcs 
ti_sn_bridge_connector_helper_funcs = {
.mode_valid = ti_sn_bridge_connector_mode_valid,
 };
 
-static enum drm_connector_status
-ti_sn_bridge_connector_detect(struct drm_connector *connector, bool force)
-{
-   /**
-* TODO: Currently if drm_panel is present, then always
-* return the status as connected. Need to add support to detect
-* device state for hot pluggable scenarios.
-*/
-   return connector_status_connected;
-}
-
 static const struct drm_connector_funcs ti_sn_bridge_connector_funcs = {
.fill_modes = drm_helper_probe_single_connector_modes,
-   .detect = ti_sn_bridge_connector_detect,
.destroy = drm_connector_cleanup,
.reset = drm_atomic_helper_connector_reset,
.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
-- 
2.31.0.208.g409f899ff0-goog



[PATCH v3 05/12] drm/bridge: ti-sn65dsi86: Move drm_panel_unprepare() to post_disable()

2021-04-02 Thread Douglas Anderson
We prepared the panel in pre_enable() so we should unprepare it in
post_disable() to match.

This becomes important once we start using pre_enable() and
post_disable() to make sure things are powered on (and then off again)
when reading the EDID.

Signed-off-by: Douglas Anderson 
Reviewed-by: Andrzej Hajda 
---

(no changes since v1)

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index c006678c9921..e30460002c48 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -452,8 +452,6 @@ static void ti_sn_bridge_disable(struct drm_bridge *bridge)
regmap_write(pdata->regmap, SN_ML_TX_MODE_REG, 0);
/* disable DP PLL */
regmap_write(pdata->regmap, SN_PLL_ENABLE_REG, 0);
-
-   drm_panel_unprepare(pdata->panel);
 }
 
 static u32 ti_sn_bridge_get_dsi_freq(struct ti_sn_bridge *pdata)
@@ -869,6 +867,8 @@ static void ti_sn_bridge_post_disable(struct drm_bridge 
*bridge)
 {
struct ti_sn_bridge *pdata = bridge_to_ti_sn_bridge(bridge);
 
+   drm_panel_unprepare(pdata->panel);
+
clk_disable_unprepare(pdata->refclk);
 
pm_runtime_put_sync(pdata->dev);
-- 
2.31.0.208.g409f899ff0-goog



[PATCH v3 03/12] drm/bridge: ti-sn65dsi86: Remove incorrectly tagged kerneldoc comment

2021-04-02 Thread Douglas Anderson
A random comment inside a function had "/**" in front of it. That
doesn't make sense. Remove.

Signed-off-by: Douglas Anderson 
Reviewed-by: Andrzej Hajda 
---

(no changes since v1)

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index 96fe8f2c0ea9..76f43af6735d 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -788,7 +788,7 @@ static void ti_sn_bridge_enable(struct drm_bridge *bridge)
/* set dsi clk frequency value */
ti_sn_bridge_set_dsi_rate(pdata);
 
-   /**
+   /*
 * The SN65DSI86 only supports ASSR Display Authentication method and
 * this method is enabled by default. An eDP panel must support this
 * authentication method. We need to enable this method in the eDP panel
-- 
2.31.0.208.g409f899ff0-goog



[PATCH v3 04/12] drm/bridge: ti-sn65dsi86: Reorder remove()

2021-04-02 Thread Douglas Anderson
Let's make the remove() function strictly the reverse of the probe()
function so it's easier to reason about.

This patch was created by code inspection and should move us closer to
a proper remove.

Signed-off-by: Douglas Anderson 
Reviewed-by: Andrzej Hajda 
---

Changes in v3:
- Removed "NOTES" from commit message.

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index 76f43af6735d..c006678c9921 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -1315,20 +1315,21 @@ static int ti_sn_bridge_remove(struct i2c_client 
*client)
if (!pdata)
return -EINVAL;
 
-   kfree(pdata->edid);
-   ti_sn_debugfs_remove(pdata);
-
-   of_node_put(pdata->host_node);
-
-   pm_runtime_disable(pdata->dev);
-
if (pdata->dsi) {
mipi_dsi_detach(pdata->dsi);
mipi_dsi_device_unregister(pdata->dsi);
}
 
+   kfree(pdata->edid);
+
+   ti_sn_debugfs_remove(pdata);
+
drm_bridge_remove(>bridge);
 
+   pm_runtime_disable(pdata->dev);
+
+   of_node_put(pdata->host_node);
+
return 0;
 }
 
-- 
2.31.0.208.g409f899ff0-goog



[PATCH v3 02/12] drm/bridge: ti-sn65dsi86: Simplify refclk handling

2021-04-02 Thread Douglas Anderson
The clock framework makes it simple to deal with an optional clock.
You can call clk_get_optional() and if the clock isn't specified it'll
just return NULL without complaint. It's valid to pass NULL to
enable/disable/prepare/unprepare. Let's make use of this to simplify
things a tiny bit.

Signed-off-by: Douglas Anderson 
Reviewed-by: Robert Foss 
Reviewed-by: Bjorn Andersson 
Reviewed-by: Stephen Boyd 
Reviewed-by: Laurent Pinchart 
Reviewed-by: Andrzej Hajda 
---

(no changes since v2)

Changes in v2:
- Removed 2nd paragraph in commit message.

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index 88df4dd0f39d..96fe8f2c0ea9 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -1275,14 +1275,9 @@ static int ti_sn_bridge_probe(struct i2c_client *client,
return ret;
}
 
-   pdata->refclk = devm_clk_get(pdata->dev, "refclk");
-   if (IS_ERR(pdata->refclk)) {
-   ret = PTR_ERR(pdata->refclk);
-   if (ret == -EPROBE_DEFER)
-   return ret;
-   DRM_DEBUG_KMS("refclk not found\n");
-   pdata->refclk = NULL;
-   }
+   pdata->refclk = devm_clk_get_optional(pdata->dev, "refclk");
+   if (IS_ERR(pdata->refclk))
+   return PTR_ERR(pdata->refclk);
 
ret = ti_sn_bridge_parse_dsi_host(pdata);
if (ret)
-- 
2.31.0.208.g409f899ff0-goog



[PATCH v3 01/12] drm/bridge: Fix the stop condition of drm_bridge_chain_pre_enable()

2021-04-02 Thread Douglas Anderson
The drm_bridge_chain_pre_enable() is not the proper opposite of
drm_bridge_chain_post_disable(). It continues along the chain to
_before_ the starting bridge. Let's fix that.

Fixes: 05193dc38197 ("drm/bridge: Make the bridge chain a double-linked list")
Signed-off-by: Douglas Anderson 
Reviewed-by: Andrzej Hajda 
---

(no changes since v1)

 drivers/gpu/drm/drm_bridge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
index 64f0effb52ac..044acd07c153 100644
--- a/drivers/gpu/drm/drm_bridge.c
+++ b/drivers/gpu/drm/drm_bridge.c
@@ -522,6 +522,9 @@ void drm_bridge_chain_pre_enable(struct drm_bridge *bridge)
list_for_each_entry_reverse(iter, >bridge_chain, chain_node) {
if (iter->funcs->pre_enable)
iter->funcs->pre_enable(iter);
+
+   if (iter == bridge)
+   break;
}
 }
 EXPORT_SYMBOL(drm_bridge_chain_pre_enable);
-- 
2.31.0.208.g409f899ff0-goog



[PATCH v3 00/12] drm: Fix EDID reading on ti-sn65dsi86

2021-04-02 Thread Douglas Anderson
The primary goal of this series is to try to properly fix EDID reading
for eDP panels using the ti-sn65dsi86 bridge.

Previously we had a patch that added EDID reading but it turned out
not to work at bootup. This caused some extra churn at bootup as we
tried (and failed) to read the EDID several times and also ended up
forcing us to use the hardcoded mode at boot. With this patch series I
believe EDID reading is reliable at boot now and we never use the
hardcoded mode.

This series is the logical successor to the 3-part series containing
the patch ("drm/bridge: ti-sn65dsi86: Properly get the EDID, but only
if refclk") [1] though only one actual patch is the same between the
two.

This series starts out with some general / obvious fixes and moves on
to some more specific and maybe controversial ones. I wouldn't object
to some of the earlier ones landing if they look ready.

This patch was developed against drm-misc-next on a
sc7180-trogdor-lazor device. To get things booting for me, I had to
use Stephen's patch [2] to keep from crashing but otherwise all the
patches I needed were here.

Primary change between v2 and v3 is to stop doing the EDID caching in
the core. I also added Andrzej's review tags.

[1] 
https://lore.kernel.org/r/20210304155144.3.I60a7fb23ce4589006bc95c64ab8d15c74b876e68@changeid/
[2] 
https://lore.kernel.org/r/161706912161.3012082.17313817257247946...@swboyd.mtv.corp.google.com/

Changes in v3:
- Removed "NOTES" from commit message.
- Rebased now that we're not moving EDID caching to the core.
- Separating out patch to block AUX channel when not powered.
- Added note about boot speed implications.
- ("Fail aux transfers right away if not powered") split out for v3.

Changes in v2:
- Removed 2nd paragraph in commit message.

Douglas Anderson (12):
  drm/bridge: Fix the stop condition of drm_bridge_chain_pre_enable()
  drm/bridge: ti-sn65dsi86: Simplify refclk handling
  drm/bridge: ti-sn65dsi86: Remove incorrectly tagged kerneldoc comment
  drm/bridge: ti-sn65dsi86: Reorder remove()
  drm/bridge: ti-sn65dsi86: Move drm_panel_unprepare() to post_disable()
  drm/bridge: ti-sn65dsi86: Get rid of the useless detect() function
  drm/bridge: ti-sn65dsi86: Remove extra call:
drm_connector_update_edid_property()
  drm/bridge: ti-sn65dsi86: Power things properly for reading the EDID
  drm/bridge: ti-sn65dsi86: Fail aux transfers right away if not powered
  drm/bridge: ti-sn65dsi86: Read the EDID only if refclk was provided
  drm/bridge: ti-sn65dsi86: Print an error if we fallback to panel modes
  drm/panel: panel-simple: Use runtime pm to avoid excessive unprepare /
prepare

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 97 ---
 drivers/gpu/drm/drm_bridge.c  |  3 +
 drivers/gpu/drm/panel/Kconfig |  1 +
 drivers/gpu/drm/panel/panel-simple.c  | 93 +++--
 4 files changed, 134 insertions(+), 60 deletions(-)

-- 
2.31.0.208.g409f899ff0-goog



Re: [PATCH v1 0/2] Add imx8m power domain driver

2021-04-02 Thread Adrien Grassein
Hi,

Le sam. 3 avr. 2021 à 00:10, Adam Ford  a écrit :
>
> On Fri, Apr 2, 2021 at 1:16 PM Adrien Grassein
>  wrote:
> >
> > Le ven. 2 avr. 2021 à 19:58, Abel Vesa  a écrit :
> > >
> > > On 21-04-02 19:48:41, Adrien Grassein wrote:
> > > > Hi,
> > > >
> > > > Le ven. 2 avr. 2021 à 19:42, Abel Vesa  a écrit :
> > > > >
> > > > > On 21-04-02 18:45:04, Adrien Grassein wrote:
> > > > > > Hi,
> > > > > >
> > > > > > this patch et aims to add the support of the i.MX 8 Power Domain 
> > > > > > driver.
> > > > > > Some devices (like usbotg2) can't work without this patch as their
> > > > > > attached power domain are down.
> > > > > >
> > > > > > The original drivr was taken from le imx kernel and aapted to fit 
> > > > > > with
> > > > > > the actual mainline (minor fixes).
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > >
> > > > > Big NACK for the whole series.
> > > > >
> > > > > This approach has already been rejected upstream.
> > > >
> > > > So what is the correct approach?
> > > > At this point otg2 node of imx8mm is not working at all (and blocks the 
> > > > whole
> > > > boot of the kernel)
> > > >
> > >
> > > Have a look at this thread:
> > >
> > > https://lkml.org/lkml/2020/4/27/706
> > >
> > Understood, so I will try to update the gpc driver (at least for otg).
>
> Thanks for doing that. I know Lucas tried a few times to get something
> going.  I'm willing to adapt whatever work you do on the Mini toward
> the Nano if you don't have time.
>

NP, the problem here is that I don't have an Nano to test.

> adam
> >
> > > > >
> > > > > Plus, you changed the original author, this work was originally done 
> > > > > by Jacky Bai.
> > > >
> > > > I have not changed this, the original author is not mentioned on the
> > > > original patch.
> > >
> > > Here is the original commit:
> > >
> > > https://github.com/Freescale/linux-fslc/commit/7ebcf5ccf423afe4ccd9c53ef204018b0b653ce0
> > >
> > >
> > > >
> > > > >
> > > > > > Adrien Grassein (2):
> > > > > >   dt-bindings: power: Add documentation for imx8m power domain 
> > > > > > driver
> > > > > >   soc: imx: add Power Domain driver for i.MX8M(M|N|P)
> > > > > >
> > > > > >  .../bindings/power/fsl,imx-power-domain.yaml  |  89 +++
> > > > > >  MAINTAINERS   |  10 +
> > > > > >  drivers/soc/imx/Kconfig   |   7 +
> > > > > >  drivers/soc/imx/Makefile  |   1 +
> > > > > >  drivers/soc/imx/imx8m_pm_domains.c| 233 
> > > > > > ++
> > > > > >  include/dt-bindings/power/imx8mm-power.h  |  21 ++
> > > > > >  include/dt-bindings/power/imx8mn-power.h  |  15 ++
> > > > > >  include/dt-bindings/power/imx8mp-power.h  |  28 +++
> > > > > >  include/soc/imx/imx_sip.h |  12 +
> > > > > >  9 files changed, 416 insertions(+)
> > > > > >  create mode 100644 
> > > > > > Documentation/devicetree/bindings/power/fsl,imx-power-domain.yaml
> > > > > >  create mode 100644 drivers/soc/imx/imx8m_pm_domains.c
> > > > > >  create mode 100644 include/dt-bindings/power/imx8mm-power.h
> > > > > >  create mode 100644 include/dt-bindings/power/imx8mn-power.h
> > > > > >  create mode 100644 include/dt-bindings/power/imx8mp-power.h
> > > > > >  create mode 100644 include/soc/imx/imx_sip.h
> > > > > >
> > > > > > --
> > > > > > 2.25.1
> > > > > >
> > > >
> > > > Thanks,
> >
> > Thanks,
> >
> > ___
> > linux-arm-kernel mailing list
> > linux-arm-ker...@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Thanks,


Re: [PATCH] firmware_loader: Remove unnecessary conversion to bool

2021-04-02 Thread Shuah Khan

On 2/18/21 2:12 AM, Jiapeng Chong wrote:

Fix the following coccicheck warnings:

./tools/testing/selftests/firmware/fw_namespace.c:98:54-59: WARNING:
conversion to bool not needed here.

Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
  tools/testing/selftests/firmware/fw_namespace.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/firmware/fw_namespace.c 
b/tools/testing/selftests/firmware/fw_namespace.c
index 5ebc1ae..0e393cb 100644
--- a/tools/testing/selftests/firmware/fw_namespace.c
+++ b/tools/testing/selftests/firmware/fw_namespace.c
@@ -95,7 +95,7 @@ static bool test_fw_in_ns(const char *fw_name, const char 
*sys_path, bool block_
}
if (block_fw_in_parent_ns)
umount("/lib/firmware");
-   return WEXITSTATUS(status) == EXIT_SUCCESS ? true : false;


This looks right to me. test_fw_in_ns() returns true or false and
test_fw_in_ns() callers print appropriate message.

I don't think this patch is necessary.

thanks,
-- Shuah


Re: [PATCH v6 clocksource] Do not mark clocks unstable dueclocksource: Provide module parameters to inject delays in watchdog

2021-04-02 Thread Thomas Gleixner
On Fri, Apr 02 2021 at 13:31, paulmck wrote:

The subsystem prefix does not parse:

[PATCH v6 clocksource] Do not mark clocks unstable dueclocksource: Provide 
module parameters to inject delays in watchdog

I look at the actual code changes after the easter break.

Thanks,

tglx


  1   2   3   4   5   6   7   8   9   >