Re: [PATCH] time: Fix overwrite err unexpected in clock_adjtime32

2021-04-12 Thread Richard Cochran
On Mon, Apr 12, 2021 at 02:52:11PM +0000, chenjun (AM) wrote:
> On 2021/4/12 22:20, Richard Cochran wrote:
> > On Mon, Apr 12, 2021 at 12:45:51PM +0000, Chen Jun wrote:
> >> the correct error is covered by put_old_timex32.
> > 
> > Well, the non-negative return code (TIME_OK, TIME_INS, etc) is
> > clobbered by put_old_timex32().
> > 
> >> Fixes: f1f1d5ebd10f ("posix-timers: Introduce a syscall for clock tuning.")
> > 
> > This is not the correct commit for the "Fixes" tag.  Please find the
> > actual commit that introduced the issue.
> > 
> > In commit f1f1d5ebd10f the code looked like this...
> > 
> > long compat_sys_clock_adjtime(clockid_t which_clock,
> > 		struct compat_timex __user *utp)
> > {
> > 	struct timex txc;
> > 	mm_segment_t oldfs;
> > 	int err, ret;
> > 
> > 	err = compat_get_timex(&txc, utp);
> > 	if (err)
> > 		return err;
> > 
> > 	oldfs = get_fs();
> > 	set_fs(KERNEL_DS);
> > 	ret = sys_clock_adjtime(which_clock, (struct timex __user *) &txc);
> > 	set_fs(oldfs);
> > 
> > 	err = compat_put_timex(utp, &txc);
> > 	if (err)
> > 		return err;
> > 
> > 	return ret;
> > }

Look at the code ^^^

> The implementation of clock_adjtime32 is similar to compat_sys_clock_adjtime,
> and I think f1f1d5ebd10 actually introduced the problem.

See how 'ret' and 'err' are two separate variables?  It makes a difference.

Thanks,
Richard
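
The difference between the two shapes can be reproduced in user space. The following is only a minimal sketch: sketch_adjtime() and sketch_put_timex() are hypothetical stand-ins for do_clock_adjtime() and the copy-out helper, and the SKETCH_* constants mirror the usual TIME_OK/TIME_INS values. It is not the kernel code.

```c
#include <errno.h>
#include <stdbool.h>

/* Non-negative clock state codes (values mirror TIME_OK and TIME_INS). */
enum { SKETCH_TIME_OK = 0, SKETCH_TIME_INS = 1 };

/* Hypothetical stand-in for do_clock_adjtime(): returns the clock state. */
static int sketch_adjtime(int state)
{
	return state;
}

/* Hypothetical stand-in for the copy-out helper: 0 on success, -EFAULT on failure. */
static int sketch_put_timex(bool copy_ok)
{
	return copy_ok ? 0 : -EFAULT;
}

/* One variable: a successful copy-out overwrites the clock state with 0. */
int adjtime_one_var(int state, bool copy_ok)
{
	int err = sketch_adjtime(state);

	if (err >= 0)
		err = sketch_put_timex(copy_ok);
	return err;
}

/* Two variables: 'ret' keeps the clock state, 'err' only reports failures. */
int adjtime_two_vars(int state, bool copy_ok)
{
	int ret = sketch_adjtime(state);
	int err = sketch_put_timex(copy_ok);

	if (err)
		return err;
	return ret;
}
```

With a state of SKETCH_TIME_INS and a successful copy-out, the one-variable version returns 0 while the two-variable version returns 1. Both return -EFAULT when the copy-out fails.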


Re: [PATCH] time: Fix overwrite err unexpected in clock_adjtime32

2021-04-12 Thread chenjun (AM)
On 2021/4/12 22:20, Richard Cochran wrote:
> On Mon, Apr 12, 2021 at 12:45:51PM +0000, Chen Jun wrote:
>> the correct error is covered by put_old_timex32.
> 
> Well, the non-negative return code (TIME_OK, TIME_INS, etc) is
> clobbered by put_old_timex32().
> 
>> Fixes: f1f1d5ebd10f ("posix-timers: Introduce a syscall for clock tuning.")
> 
> This is not the correct commit for the "Fixes" tag.  Please find the
> actual commit that introduced the issue.
> 
> In commit f1f1d5ebd10f the code looked like this...
> 
>   long compat_sys_clock_adjtime(clockid_t which_clock,
>   		struct compat_timex __user *utp)
>   {
>   	struct timex txc;
>   	mm_segment_t oldfs;
>   	int err, ret;
>   
>   	err = compat_get_timex(&txc, utp);
>   	if (err)
>   		return err;
>   
>   	oldfs = get_fs();
>   	set_fs(KERNEL_DS);
>   	ret = sys_clock_adjtime(which_clock, (struct timex __user *) &txc);
>   	set_fs(oldfs);
>   
>   	err = compat_put_timex(utp, &txc);
>   	if (err)
>   		return err;
>   
>   	return ret;
>   }
> 

f1f1d5ebd10:
Introduce compat_sys_clock_adjtime

62a6fa97684:
rename compat_sys_clock_adjtime to COMPAT_SYSCALL_DEFINE2(clock_adjtime

3a4d44b6162:
move COMPAT_SYSCALL_DEFINE2(clock_adjtime from kernel/compat.c to 
kernel/time/posix-timers.c

8dabe7245bb:
COMPAT_SYSCALL_DEFINE2(clock_adjtime, ..
-> SYSCALL_DEFINE2(clock_adjtime32, ..

The implementation of clock_adjtime32 is similar to compat_sys_clock_adjtime,
and I think f1f1d5ebd10 actually introduced the problem.

> Thanks,
> Richard
> 
> 
> 
>> Signed-off-by: Chen Jun 
>> ---
>>   kernel/time/posix-timers.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
>> index bf540f5a..dd5697d 100644
>> --- a/kernel/time/posix-timers.c
>> +++ b/kernel/time/posix-timers.c
>> @@ -1191,8 +1191,8 @@ SYSCALL_DEFINE2(clock_adjtime32, clockid_t, which_clock,
>>   
>>  	err = do_clock_adjtime(which_clock, &ktx);
>>   
>> -	if (err >= 0)
>> -		err = put_old_timex32(utp, &ktx);
>> +	if (err >= 0 && put_old_timex32(utp, &ktx))
>> +		return -EFAULT;
>>   
>>  return err;
>>   }
>> -- 
>> 2.9.4
>>
> 


-- 
Regards
Chen Jun


Re: [PATCH] time: Fix overwrite err unexpected in clock_adjtime32

2021-04-12 Thread Richard Cochran
On Mon, Apr 12, 2021 at 12:45:51PM +0000, Chen Jun wrote:
> the correct error is covered by put_old_timex32.

Well, the non-negative return code (TIME_OK, TIME_INS, etc) is
clobbered by put_old_timex32().

> Fixes: f1f1d5ebd10f ("posix-timers: Introduce a syscall for clock tuning.")

This is not the correct commit for the "Fixes" tag.  Please find the
actual commit that introduced the issue.

In commit f1f1d5ebd10f the code looked like this...

long compat_sys_clock_adjtime(clockid_t which_clock,
		struct compat_timex __user *utp)
{
	struct timex txc;
	mm_segment_t oldfs;
	int err, ret;

	err = compat_get_timex(&txc, utp);
	if (err)
		return err;

	oldfs = get_fs();
	set_fs(KERNEL_DS);
	ret = sys_clock_adjtime(which_clock, (struct timex __user *) &txc);
	set_fs(oldfs);

	err = compat_put_timex(utp, &txc);
	if (err)
		return err;

	return ret;
}

Thanks,
Richard



> Signed-off-by: Chen Jun 
> ---
>  kernel/time/posix-timers.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
> index bf540f5a..dd5697d 100644
> --- a/kernel/time/posix-timers.c
> +++ b/kernel/time/posix-timers.c
> @@ -1191,8 +1191,8 @@ SYSCALL_DEFINE2(clock_adjtime32, clockid_t, which_clock,
>  
>  	err = do_clock_adjtime(which_clock, &ktx);
>  
> -	if (err >= 0)
> -		err = put_old_timex32(utp, &ktx);
> +	if (err >= 0 && put_old_timex32(utp, &ktx))
> +		return -EFAULT;
>  
>   return err;
>  }
> -- 
> 2.9.4
> 


Re: [PATCH RESEND v1 4/4] powerpc/vdso: Add support for time namespaces

2021-04-12 Thread Vincenzo Frascino



On 3/31/21 5:48 PM, Christophe Leroy wrote:
> This patch adds the necessary glue to provide time namespaces.
> 
> Things are mainly copied from ARM64.
> 
> __arch_get_timens_vdso_data() calculates timens vdso data position
> based on the vdso data position, knowing it is the next page in vvar.
> This avoids having to redo the mflr/bcl/mflr/mtlr dance to locate
> the page relative to running code position.
> 
> Signed-off-by: Christophe Leroy 

Reviewed-by: Vincenzo Frascino  # vDSO parts
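
The page arithmetic described in the quoted commit message can be sketched in user space. This is only an illustration under stated assumptions: the 4096-byte page size, struct sketch_vdso_data, and all sketch_* names are hypothetical stand-ins, and the enum merely mirrors the page order that the patch introduces.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration only */

/* Page order inside the vvar mapping, mirroring the enum in the patch. */
enum sketch_vvar_pages {
	SKETCH_VVAR_DATA_PAGE_OFFSET,
	SKETCH_VVAR_TIMENS_PAGE_OFFSET,
	SKETCH_VVAR_NR_PAGES,
};

/* Hypothetical stand-in for struct vdso_data; the real layout is not shown here. */
struct sketch_vdso_data {
	unsigned int seq;
};

/*
 * The timens page directly follows the data page, so it can be found from
 * the data pointer alone, without locating the mapping relative to the code.
 */
static inline const struct sketch_vdso_data *
sketch_timens_data(const struct sketch_vdso_data *vd)
{
	size_t pages = SKETCH_VVAR_TIMENS_PAGE_OFFSET - SKETCH_VVAR_DATA_PAGE_OFFSET;

	return (const struct sketch_vdso_data *)
		((const char *)vd + pages * SKETCH_PAGE_SIZE);
}
```

Keeping the page order in one enum means the mapping code and the lookup helper cannot silently disagree about which page comes first.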

> ---
>  arch/powerpc/Kconfig |   3 +-
>  arch/powerpc/include/asm/vdso/gettimeofday.h |  10 ++
>  arch/powerpc/include/asm/vdso_datapage.h |   2 -
>  arch/powerpc/kernel/vdso.c   | 116 ---
>  arch/powerpc/kernel/vdso32/vdso32.lds.S  |   2 +-
>  arch/powerpc/kernel/vdso64/vdso64.lds.S  |   2 +-
>  6 files changed, 114 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index c1344c05226c..71daff5f15d5 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -172,6 +172,7 @@ config PPC
>   select GENERIC_CPU_AUTOPROBE
>   select GENERIC_CPU_VULNERABILITIES  if PPC_BARRIER_NOSPEC
>   select GENERIC_EARLY_IOREMAP
> + select GENERIC_GETTIMEOFDAY
>   select GENERIC_IRQ_SHOW
>   select GENERIC_IRQ_SHOW_LEVEL
>   select GENERIC_PCI_IOMAPif PCI
> @@ -179,7 +180,7 @@ config PPC
>   select GENERIC_STRNCPY_FROM_USER
>   select GENERIC_STRNLEN_USER
>   select GENERIC_TIME_VSYSCALL
> - select GENERIC_GETTIMEOFDAY
> + select GENERIC_VDSO_TIME_NS
>   select HAVE_ARCH_AUDITSYSCALL
>   select HAVE_ARCH_HUGE_VMAP  if PPC_BOOK3S_64 && 
> PPC_RADIX_MMU
>   select HAVE_ARCH_JUMP_LABEL
> diff --git a/arch/powerpc/include/asm/vdso/gettimeofday.h 
> b/arch/powerpc/include/asm/vdso/gettimeofday.h
> index d453e725c79f..e448df1dd071 100644
> --- a/arch/powerpc/include/asm/vdso/gettimeofday.h
> +++ b/arch/powerpc/include/asm/vdso/gettimeofday.h
> @@ -2,6 +2,8 @@
>  #ifndef _ASM_POWERPC_VDSO_GETTIMEOFDAY_H
>  #define _ASM_POWERPC_VDSO_GETTIMEOFDAY_H
>  
> +#include 
> +
>  #ifdef __ASSEMBLY__
>  
>  #include 
> @@ -153,6 +155,14 @@ static __always_inline u64 __arch_get_hw_counter(s32 
> clock_mode,
>  
>  const struct vdso_data *__arch_get_vdso_data(void);
>  
> +#ifdef CONFIG_TIME_NS
> +static __always_inline
> +const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd)
> +{
> + return (void *)vd + PAGE_SIZE;
> +}
> +#endif
> +
>  static inline bool vdso_clocksource_ok(const struct vdso_data *vd)
>  {
>   return true;
> diff --git a/arch/powerpc/include/asm/vdso_datapage.h 
> b/arch/powerpc/include/asm/vdso_datapage.h
> index 3f958ecf2beb..a585c8e538ff 100644
> --- a/arch/powerpc/include/asm/vdso_datapage.h
> +++ b/arch/powerpc/include/asm/vdso_datapage.h
> @@ -107,9 +107,7 @@ extern struct vdso_arch_data *vdso_data;
>   bcl 20, 31, .+4
>  999:
>   mflr\ptr
> -#if CONFIG_PPC_PAGE_SHIFT > 14
>   addis   \ptr, \ptr, (_vdso_datapage - 999b)@ha
> -#endif
>   addi\ptr, \ptr, (_vdso_datapage - 999b)@l
>  .endm
>  
> diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
> index b14907209822..717f2c9a7573 100644
> --- a/arch/powerpc/kernel/vdso.c
> +++ b/arch/powerpc/kernel/vdso.c
> @@ -18,6 +18,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include 
> @@ -50,6 +51,12 @@ static union {
>  } vdso_data_store __page_aligned_data;
>  struct vdso_arch_data *vdso_data = &vdso_data_store.data;
>  
> +enum vvar_pages {
> + VVAR_DATA_PAGE_OFFSET,
> + VVAR_TIMENS_PAGE_OFFSET,
> + VVAR_NR_PAGES,
> +};
> +
>  static int vdso_mremap(const struct vm_special_mapping *sm, struct 
> vm_area_struct *new_vma,
>  unsigned long text_size)
>  {
> @@ -73,8 +80,12 @@ static int vdso64_mremap(const struct vm_special_mapping *sm, struct vm_area_str
>   return vdso_mremap(sm, new_vma, &vdso64_end - &vdso64_start);
>  }
>  
> +static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
> +  struct vm_area_struct *vma, struct vm_fault *vmf);
> +
>  static struct vm_special_mapping vvar_spec __ro_after_init = {
>   .name = "[vvar]",
> + .fault = vvar_fault,
>  };
>  
>  static struct vm_special_mapping vdso32_spec __ro_after_init = {
> @@ -87,6 +98,94 @@ static struct vm_special_mapping vdso64_spec 
> __ro_after_init = {
>   .mremap = vdso64_mremap,
>  };
>  
> +#ifdef CONFIG_TIME

Re: [PATCH RESEND v1 0/4] powerpc/vdso: Add support for time namespaces

2021-04-12 Thread Thomas Gleixner
On Wed, Mar 31 2021 at 16:48, Christophe Leroy wrote:
> [Sorry, resending with complete destination list, I used the wrong script on 
> the first delivery]
>
> This series adds support for time namespaces on powerpc.
>
> All timens selftests are successful.

If PPC people want to pick up the whole lot, no objections from my side.

Thanks,

tglx


[PATCH] time: Fix overwrite err unexpected in clock_adjtime32

2021-04-12 Thread Chen Jun
the correct error is covered by put_old_timex32.

Fixes: f1f1d5ebd10f ("posix-timers: Introduce a syscall for clock tuning.")
Signed-off-by: Chen Jun 
---
 kernel/time/posix-timers.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index bf540f5a..dd5697d 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -1191,8 +1191,8 @@ SYSCALL_DEFINE2(clock_adjtime32, clockid_t, which_clock,
 
 	err = do_clock_adjtime(which_clock, &ktx);
 
-	if (err >= 0)
-		err = put_old_timex32(utp, &ktx);
+	if (err >= 0 && put_old_timex32(utp, &ktx))
+		return -EFAULT;
 
return err;
 }
-- 
2.9.4



Re: [RESEND] i2c: mediatek: Get device clock-stretch time via dts

2021-04-12 Thread Qii Wang
On Wed, 2021-04-07 at 20:19 +0200, Wolfram Sang wrote:
> > Due to clock stretch, our HW IP cannot meet the ac-timing
> > spec(tSU;STA,tSU;STO). 
> > The delay caused by clock stretching is not constant, so we need to
> > pass a parameter, found through measurement, that meets most
> > conditions.
> 
> What about using this existing binding?
> 
> - i2c-scl-internal-delay-ns
> Number of nanoseconds the IP core additionally needs to setup SCL.
> 

I can't see the relationship between "i2c-scl-falling-time-ns" and clock
stretching. Is there a parameter related to clock stretching?
If you think both of them affect the AC timing of SCL, then
"i2c-scl-falling-time-ns" may be a good choice.



[PATCH 1/4] USB: serial: f81232: drop time-based drain delay

2021-04-12 Thread Johan Hovold
The f81232 driver now waits for the transmit FIFO to drain during close
so there is no need to keep the time-based drain delay, which would add
up to two seconds on every close for low line speeds.

Fixes: 98405f81036d ("USB: serial: f81232: add tx_empty function")
Signed-off-by: Johan Hovold 
---
 drivers/usb/serial/f81232.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/usb/serial/f81232.c b/drivers/usb/serial/f81232.c
index b4b847dce4bc..a7a7af8d05bf 100644
--- a/drivers/usb/serial/f81232.c
+++ b/drivers/usb/serial/f81232.c
@@ -948,7 +948,6 @@ static int f81232_port_probe(struct usb_serial_port *port)
 
usb_set_serial_port_data(port, priv);
 
-   port->port.drain_delay = 256;
priv->port = port;
 
return 0;
-- 
2.26.3



[PATCH v4 3/6] perf arm-spe: Convert event kernel time to counter value

2021-04-12 Thread Leo Yan
When handling a perf event, the Arm SPE decoder needs to decide whether the
event is earlier or later than the samples from the Arm SPE trace data; to
make the comparison, both times must use the same unit.

This patch converts the event's kernel time to the arch timer's counter value,
so that it can be compared with the counter value contained in the Arm SPE
Timestamp packet.

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 7620dcc45940..23714cf0380e 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -669,7 +669,7 @@ static int arm_spe_process_event(struct perf_session *session,
 	}
 
 	if (sample->time && (sample->time != (u64) -1))
-		timestamp = sample->time;
+		timestamp = perf_time_to_tsc(sample->time, &spe->tc);
 	else
 		timestamp = 0;
 
-- 
2.25.1
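
For readers unfamiliar with the conversion, the arithmetic behind helpers such as perf_time_to_tsc() and tsc_to_perf_time() can be sketched as follows. This is a simplified illustration, assuming the usual time_zero/time_mult/time_shift scheme documented for the perf mmap page. struct sketch_tsc_conv and the sketch_* functions are hypothetical names, and the real perf helpers may handle additional cases (for example, short counters) that are omitted here.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for perf's conversion parameters. */
struct sketch_tsc_conv {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

/*
 * Counter value -> kernel time in ns: time_zero + (cyc * mult) >> shift,
 * split into quotient and remainder to limit intermediate overflow.
 */
uint64_t sketch_tsc_to_time(uint64_t cyc, const struct sketch_tsc_conv *tc)
{
	uint64_t quot = cyc >> tc->time_shift;
	uint64_t rem = cyc & (((uint64_t)1 << tc->time_shift) - 1);

	return tc->time_zero + quot * tc->time_mult +
	       ((rem * tc->time_mult) >> tc->time_shift);
}

/* Kernel time in ns -> counter value: the inverse of the function above. */
uint64_t sketch_time_to_tsc(uint64_t ns, const struct sketch_tsc_conv *tc)
{
	uint64_t t = ns - tc->time_zero;
	uint64_t quot = t / tc->time_mult;
	uint64_t rem = t % tc->time_mult;

	return (quot << tc->time_shift) +
	       (rem << tc->time_shift) / tc->time_mult;
}
```

With time_shift = 10 and time_mult = 40960 (40 ns per counter tick) and time_zero = 1000, a counter value of 100 maps to 5000 ns, and 5000 ns maps back to 100. Once both sides are in the same unit, the decoder can order perf events against trace timestamps with a plain integer comparison.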



[PATCH v4 4/6] perf arm-spe: Assign kernel time to synthesized event

2021-04-12 Thread Leo Yan
The current code assigns the arch timer counter to the samples synthesized
from the Arm SPE trace, so the samples contain only the raw counter value
rather than the kernel time.

To fix this, convert the timer counter to kernel time and assign it to the
sample timestamp.

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 23714cf0380e..c13a89f06ab8 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -234,7 +234,7 @@ static void arm_spe_prep_sample(struct arm_spe *spe,
 	struct arm_spe_record *record = &speq->decoder->record;
 
 	if (!spe->timeless_decoding)
-		sample->time = speq->timestamp;
+		sample->time = tsc_to_perf_time(record->timestamp, &spe->tc);
 
sample->ip = record->from_ip;
sample->cpumode = arm_spe_cpumode(spe, sample->ip);
-- 
2.25.1



[PATCH 5.11 073/210] mac80211: fix time-is-after bug in mlme

2021-04-12 Thread Greg Kroah-Hartman
From: Ben Greear 

commit 7d73cd946d4bc7d44cdc5121b1c61d5d71425dea upstream.

The incorrect timeout check caused probing to happen when it did
not need to happen.  This in turn caused tx performance drop
for around 5 seconds in ath10k-ct driver.  Possibly that tx drop
is due to a secondary issue, but fixing the probe to not happen
when traffic is running fixes the symptom.

Signed-off-by: Ben Greear 
Fixes: 9abf4e49830d ("mac80211: optimize station connection monitor")
Acked-by: Felix Fietkau 
Link: https://lore.kernel.org/r/20210330230749.14097-1-gree...@candelatech.com
Signed-off-by: Johannes Berg 
Signed-off-by: Greg Kroah-Hartman 
---
 net/mac80211/mlme.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -4707,7 +4707,10 @@ static void ieee80211_sta_conn_mon_timer
timeout = sta->rx_stats.last_rx;
timeout += IEEE80211_CONNECTION_IDLE_TIME;
 
-   if (time_is_before_jiffies(timeout)) {
+   /* If timeout is after now, then update timer to fire at
+* the later date, but do not actually probe at this time.
+*/
+   if (time_is_after_jiffies(timeout)) {
 		mod_timer(&ifmgd->conn_mon_timer, round_jiffies_up(timeout));
return;
}
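
The inverted check is easy to reproduce outside the kernel. Below is a minimal user-space sketch of the corrected decision, assuming a free-running tick counter: 'now' stands in for jiffies and 'idle' for IEEE80211_CONNECTION_IDLE_TIME. The sketch_* names are hypothetical, and sketch_time_after() only imitates the wraparound-safe comparison that the kernel's jiffies helpers perform.

```c
#include <stdbool.h>

/* Wraparound-safe "a is after b" for a free-running tick counter. */
static bool sketch_time_after(unsigned long a, unsigned long b)
{
	/*
	 * The signed difference stays correct across counter wraparound,
	 * provided the two values are less than half the range apart.
	 */
	return (long)(b - a) < 0;
}

/* Decide whether the connection monitor should probe the station now. */
bool sketch_should_probe(unsigned long now, unsigned long last_rx,
			 unsigned long idle)
{
	unsigned long timeout = last_rx + idle;

	/* Timeout still in the future: re-arm the timer, do not probe. */
	if (sketch_time_after(timeout, now))
		return false;

	/* Idle period has elapsed with no traffic: probe. */
	return true;
}
```

With traffic seen 10 ticks ago and a 30-tick idle time, the timeout lies in the future and no probe is sent, which is the behaviour the fix restores. The pre-fix check took the early return in the opposite case, so probing happened while traffic was still flowing.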




[PATCH 5.11 027/210] IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS

2021-04-12 Thread Greg Kroah-Hartman
From: Mike Marciniszyn 

commit 5de61a47eb9064cbbc5f3360d639e8e34a690a54 upstream.

A panic can result when AIP is enabled:

  BUG: unable to handle kernel NULL pointer dereference at 000
  PGD 0 P4D 0
  Oops:  1 SMP PTI
  CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE - - - 
4.18.0-240.el8.x86_64 #1
  Hardware name: Intel Corporation S2600KP/S2600KP, BIOS 
SE5C610.86B.01.01.0005.101720141054 10/17/2014
  RIP: 0010:__bitmap_and+0x1b/0x70
  RSP: 0018:99aa0845f9f0 EFLAGS: 00010246
  RAX:  RBX: 8d5a6fc18000 RCX: 0048
  RDX:  RSI: c06336f0 RDI: 8d5a8fa67750
  RBP: 0079 R08: 000f R09: 
  R10:  R11: 0001 R12: c06336f0
  R13: 00a0 R14: 8d5a6fc18000 R15: 0003
  FS: 7fec137a5980() GS:8d5a9fa8() knlGS:
  CS: 0010 DS:  ES:  CR0: 80050033
  CR2:  CR3: 000a04b48002 CR4: 001606e0
  Call Trace:
  hfi1_num_netdev_contexts+0x7c/0x110 [hfi1]
  hfi1_init_dd+0xd7f/0x1a90 [hfi1]
  ? pci_bus_read_config_dword+0x49/0x70
  ? pci_mmcfg_read+0x3e/0xe0
  do_init_one.isra.18+0x336/0x640 [hfi1]
  local_pci_probe+0x41/0x90
  pci_device_probe+0x105/0x1c0
  really_probe+0x212/0x440
  driver_probe_device+0x49/0xc0
  device_driver_attach+0x50/0x60
  __driver_attach+0x61/0x130
  ? device_driver_attach+0x60/0x60
  bus_for_each_dev+0x77/0xc0
  ? klist_add_tail+0x3b/0x70
  bus_add_driver+0x14d/0x1e0
  ? dev_init+0x10b/0x10b [hfi1]
  driver_register+0x6b/0xb0
  ? dev_init+0x10b/0x10b [hfi1]
  hfi1_mod_init+0x1e6/0x20a [hfi1]
  do_one_initcall+0x46/0x1c3
  ? free_unref_page_commit+0x91/0x100
  ? _cond_resched+0x15/0x30
  ? kmem_cache_alloc_trace+0x140/0x1c0
  do_init_module+0x5a/0x220
  load_module+0x14b4/0x17e0
  ? __do_sys_finit_module+0xa8/0x110
  __do_sys_finit_module+0xa8/0x110
  do_syscall_64+0x5b/0x1a0

The issue happens when pcibus_to_node() returns NO_NUMA_NODE.

Fix this issue by moving the initialization of dd->node to hfi1_devdata
allocation and remove the other pcibus_to_node() calls in the probe path
and use dd->node instead.

Affinity logic is adjusted to use a new field dd->affinity_entry as a
guard instead of dd->node.

Fixes: 4730f4a6c6b2 ("IB/hfi1: Activate the dummy netdev")
Link: 
https://lore.kernel.org/r/1617025700-31865-4-git-send-email-dennis.dalessan...@cornelisnetworks.com
Cc: sta...@vger.kernel.org
Signed-off-by: Mike Marciniszyn 
Signed-off-by: Dennis Dalessandro 
Signed-off-by: Jason Gunthorpe 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/infiniband/hw/hfi1/affinity.c  |   21 +
 drivers/infiniband/hw/hfi1/hfi.h   |1 +
 drivers/infiniband/hw/hfi1/init.c  |   10 +-
 drivers/infiniband/hw/hfi1/netdev_rx.c |3 +--
 4 files changed, 16 insertions(+), 19 deletions(-)

--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -632,22 +632,11 @@ static void _dev_comp_vect_cpu_mask_clea
  */
 int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 {
-   int node = pcibus_to_node(dd->pcidev->bus);
struct hfi1_affinity_node *entry;
const struct cpumask *local_mask;
int curr_cpu, possible, i, ret;
bool new_entry = false;
 
-   /*
-* If the BIOS does not have the NUMA node information set, select
-* NUMA 0 so we get consistent performance.
-*/
-   if (node < 0) {
-   dd_dev_err(dd, "Invalid PCI NUMA node. Performance may be 
affected\n");
-   node = 0;
-   }
-   dd->node = node;
-
local_mask = cpumask_of_node(dd->node);
if (cpumask_first(local_mask) >= nr_cpu_ids)
local_mask = topology_core_cpumask(0);
@@ -660,7 +649,7 @@ int hfi1_dev_affinity_init(struct hfi1_d
 * create an entry in the global affinity structure and initialize it.
 */
if (!entry) {
-   entry = node_affinity_allocate(node);
+   entry = node_affinity_allocate(dd->node);
if (!entry) {
dd_dev_err(dd,
   "Unable to allocate global affinity node\n");
@@ -751,6 +740,7 @@ int hfi1_dev_affinity_init(struct hfi1_d
if (new_entry)
node_affinity_add_tail(entry);
 
+   dd->affinity_entry = entry;
 	mutex_unlock(&node_affinity.lock);
 
return 0;
@@ -766,10 +756,9 @@ void hfi1_dev_affinity_clean_up(struct h
 {
struct hfi1_affinity_node *entry;
 
-   if (dd->node < 0)
-   return;
-
 	mutex_lock(&node_affinity.lock);
+   if (!dd->affinity_entry)
+   goto unlock;
entry = node_affinity_lookup(dd->node);
if (!entry)
goto unlock;
@@ -780,8 +769,8 @@ void hfi1_dev_affinity_clean_up(struct h
 */
_dev_comp_vect_cpu_mask_clean_up(dd, entry);
 unlock:
+   

[PATCH 5.10 066/188] mac80211: fix time-is-after bug in mlme

2021-04-12 Thread Greg Kroah-Hartman
From: Ben Greear 

commit 7d73cd946d4bc7d44cdc5121b1c61d5d71425dea upstream.

The incorrect timeout check caused probing to happen when it did
not need to happen.  This in turn caused tx performance drop
for around 5 seconds in ath10k-ct driver.  Possibly that tx drop
is due to a secondary issue, but fixing the probe to not happen
when traffic is running fixes the symptom.

Signed-off-by: Ben Greear 
Fixes: 9abf4e49830d ("mac80211: optimize station connection monitor")
Acked-by: Felix Fietkau 
Link: https://lore.kernel.org/r/20210330230749.14097-1-gree...@candelatech.com
Signed-off-by: Johannes Berg 
Signed-off-by: Greg Kroah-Hartman 
---
 net/mac80211/mlme.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -4660,7 +4660,10 @@ static void ieee80211_sta_conn_mon_timer
timeout = sta->rx_stats.last_rx;
timeout += IEEE80211_CONNECTION_IDLE_TIME;
 
-   if (time_is_before_jiffies(timeout)) {
+   /* If timeout is after now, then update timer to fire at
+* the later date, but do not actually probe at this time.
+*/
+   if (time_is_after_jiffies(timeout)) {
 		mod_timer(&ifmgd->conn_mon_timer, round_jiffies_up(timeout));
return;
}




[PATCH 5.10 020/188] IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS

2021-04-12 Thread Greg Kroah-Hartman
From: Mike Marciniszyn 

commit 5de61a47eb9064cbbc5f3360d639e8e34a690a54 upstream.

A panic can result when AIP is enabled:

  BUG: unable to handle kernel NULL pointer dereference at 000
  PGD 0 P4D 0
  Oops:  1 SMP PTI
  CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE - - - 
4.18.0-240.el8.x86_64 #1
  Hardware name: Intel Corporation S2600KP/S2600KP, BIOS 
SE5C610.86B.01.01.0005.101720141054 10/17/2014
  RIP: 0010:__bitmap_and+0x1b/0x70
  RSP: 0018:99aa0845f9f0 EFLAGS: 00010246
  RAX:  RBX: 8d5a6fc18000 RCX: 0048
  RDX:  RSI: c06336f0 RDI: 8d5a8fa67750
  RBP: 0079 R08: 000f R09: 
  R10:  R11: 0001 R12: c06336f0
  R13: 00a0 R14: 8d5a6fc18000 R15: 0003
  FS: 7fec137a5980() GS:8d5a9fa8() knlGS:
  CS: 0010 DS:  ES:  CR0: 80050033
  CR2:  CR3: 000a04b48002 CR4: 001606e0
  Call Trace:
  hfi1_num_netdev_contexts+0x7c/0x110 [hfi1]
  hfi1_init_dd+0xd7f/0x1a90 [hfi1]
  ? pci_bus_read_config_dword+0x49/0x70
  ? pci_mmcfg_read+0x3e/0xe0
  do_init_one.isra.18+0x336/0x640 [hfi1]
  local_pci_probe+0x41/0x90
  pci_device_probe+0x105/0x1c0
  really_probe+0x212/0x440
  driver_probe_device+0x49/0xc0
  device_driver_attach+0x50/0x60
  __driver_attach+0x61/0x130
  ? device_driver_attach+0x60/0x60
  bus_for_each_dev+0x77/0xc0
  ? klist_add_tail+0x3b/0x70
  bus_add_driver+0x14d/0x1e0
  ? dev_init+0x10b/0x10b [hfi1]
  driver_register+0x6b/0xb0
  ? dev_init+0x10b/0x10b [hfi1]
  hfi1_mod_init+0x1e6/0x20a [hfi1]
  do_one_initcall+0x46/0x1c3
  ? free_unref_page_commit+0x91/0x100
  ? _cond_resched+0x15/0x30
  ? kmem_cache_alloc_trace+0x140/0x1c0
  do_init_module+0x5a/0x220
  load_module+0x14b4/0x17e0
  ? __do_sys_finit_module+0xa8/0x110
  __do_sys_finit_module+0xa8/0x110
  do_syscall_64+0x5b/0x1a0

The issue happens when pcibus_to_node() returns NO_NUMA_NODE.

Fix this issue by moving the initialization of dd->node to hfi1_devdata
allocation and remove the other pcibus_to_node() calls in the probe path
and use dd->node instead.

Affinity logic is adjusted to use a new field dd->affinity_entry as a
guard instead of dd->node.

Fixes: 4730f4a6c6b2 ("IB/hfi1: Activate the dummy netdev")
Link: 
https://lore.kernel.org/r/1617025700-31865-4-git-send-email-dennis.dalessan...@cornelisnetworks.com
Cc: sta...@vger.kernel.org
Signed-off-by: Mike Marciniszyn 
Signed-off-by: Dennis Dalessandro 
Signed-off-by: Jason Gunthorpe 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/infiniband/hw/hfi1/affinity.c  |   21 +
 drivers/infiniband/hw/hfi1/hfi.h   |1 +
 drivers/infiniband/hw/hfi1/init.c  |   10 +-
 drivers/infiniband/hw/hfi1/netdev_rx.c |3 +--
 4 files changed, 16 insertions(+), 19 deletions(-)

--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -632,22 +632,11 @@ static void _dev_comp_vect_cpu_mask_clea
  */
 int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 {
-   int node = pcibus_to_node(dd->pcidev->bus);
struct hfi1_affinity_node *entry;
const struct cpumask *local_mask;
int curr_cpu, possible, i, ret;
bool new_entry = false;
 
-   /*
-* If the BIOS does not have the NUMA node information set, select
-* NUMA 0 so we get consistent performance.
-*/
-   if (node < 0) {
-   dd_dev_err(dd, "Invalid PCI NUMA node. Performance may be 
affected\n");
-   node = 0;
-   }
-   dd->node = node;
-
local_mask = cpumask_of_node(dd->node);
if (cpumask_first(local_mask) >= nr_cpu_ids)
local_mask = topology_core_cpumask(0);
@@ -660,7 +649,7 @@ int hfi1_dev_affinity_init(struct hfi1_d
 * create an entry in the global affinity structure and initialize it.
 */
if (!entry) {
-   entry = node_affinity_allocate(node);
+   entry = node_affinity_allocate(dd->node);
if (!entry) {
dd_dev_err(dd,
   "Unable to allocate global affinity node\n");
@@ -751,6 +740,7 @@ int hfi1_dev_affinity_init(struct hfi1_d
if (new_entry)
node_affinity_add_tail(entry);
 
+   dd->affinity_entry = entry;
 	mutex_unlock(&node_affinity.lock);
 
return 0;
@@ -766,10 +756,9 @@ void hfi1_dev_affinity_clean_up(struct h
 {
struct hfi1_affinity_node *entry;
 
-   if (dd->node < 0)
-   return;
-
 	mutex_lock(&node_affinity.lock);
+   if (!dd->affinity_entry)
+   goto unlock;
entry = node_affinity_lookup(dd->node);
if (!entry)
goto unlock;
@@ -780,8 +769,8 @@ void hfi1_dev_affinity_clean_up(struct h
 */
_dev_comp_vect_cpu_mask_clean_up(dd, entry);
 unlock:
+   

[tip: core/rcu] rcutorture: Make TREE03 use real-time tree.use_softirq setting

2021-04-11 Thread tip-bot2 for Paul E. McKenney
The following commit has been merged into the core/rcu branch of tip:

Commit-ID: e2b949d54392ad890bb10fb8954d967e2fcd7503
Gitweb:
https://git.kernel.org/tip/e2b949d54392ad890bb10fb8954d967e2fcd7503
Author:Paul E. McKenney 
AuthorDate:Thu, 14 Jan 2021 16:11:04 -08:00
Committer: Paul E. McKenney 
CommitterDate: Mon, 08 Mar 2021 14:21:40 -08:00

rcutorture: Make TREE03 use real-time tree.use_softirq setting

TREE03 tests RCU priority boosting, which is a real-time feature.
It would also be good if it tested something closer to what is
actually used by the real-time folks.  This commit therefore adds
tree.use_softirq=0 to the TREE03 kernel boot parameters in TREE03.boot.

Signed-off-by: Paul E. McKenney 
---
 tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot 
b/tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot
index 1c21894..64f864f 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot
@@ -4,3 +4,4 @@ rcutree.gp_init_delay=3
 rcutree.gp_cleanup_delay=3
 rcutree.kthread_prio=2
 threadirqs
+tree.use_softirq=0


[PATCH v3 4/6] perf arm-spe: Assign kernel time to synthesized event

2021-04-09 Thread Leo Yan
The current code assigns the arch timer counter to the samples synthesized
from the Arm SPE trace, so the samples contain only the raw counter value
rather than the kernel time.

To fix this, convert the timer counter to kernel time and assign it to the
sample timestamp.

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index f66e10c62473..ec7df83b50fd 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -234,7 +234,7 @@ static void arm_spe_prep_sample(struct arm_spe *spe,
 	struct arm_spe_record *record = &speq->decoder->record;
 
 	if (!spe->timeless_decoding)
-		sample->time = speq->timestamp;
+		sample->time = tsc_to_perf_time(record->timestamp, &spe->tc);
 
sample->ip = record->from_ip;
sample->cpumode = arm_spe_cpumode(spe, sample->ip);
-- 
2.25.1



[PATCH v3 3/6] perf arm-spe: Convert event kernel time to counter value

2021-04-09 Thread Leo Yan
When handling a perf event, the Arm SPE decoder needs to decide whether the
event is earlier or later than the samples from the Arm SPE trace data; to
make the comparison, both times must use the same unit.

This patch converts the event's kernel time to the arch timer's counter value,
so that it can be compared with the counter value contained in the Arm SPE
Timestamp packet.

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index b48816d5c0b4..f66e10c62473 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -669,7 +669,7 @@ static int arm_spe_process_event(struct perf_session *session,
 	}
 
 	if (sample->time && (sample->time != (u64) -1))
-		timestamp = sample->time;
+		timestamp = perf_time_to_tsc(sample->time, &spe->tc);
 	else
 		timestamp = 0;
 
-- 
2.25.1



[PATCH v2] usb: core: reduce power-on-good delay time of root hub

2021-04-09 Thread Chunfeng Yun
Return exactly the delay time given by the root hub descriptor;
this helps to reduce resume time, etc.

Because the root hub descriptor is usually provided by the host
controller driver, a compatibility issue with one root hub can be
fixed easily in that driver without affecting other root hubs.

Acked-by: Alan Stern 
Signed-off-by: Chunfeng Yun 
---
v2: remove RFC tag, and add acked-by Alan
---
 drivers/usb/core/hub.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/core/hub.h b/drivers/usb/core/hub.h
index 73f4482d833a..22ea1f4f2d66 100644
--- a/drivers/usb/core/hub.h
+++ b/drivers/usb/core/hub.h
@@ -148,8 +148,10 @@ static inline unsigned hub_power_on_good_delay(struct usb_hub *hub)
 {
unsigned delay = hub->descriptor->bPwrOn2PwrGood * 2;
 
-   /* Wait at least 100 msec for power to become stable */
-   return max(delay, 100U);
+   if (!hub->hdev->parent) /* root hub */
+   return delay;
+   else /* Wait at least 100 msec for power to become stable */
+   return max(delay, 100U);
 }
 
 static inline int hub_port_debounce_be_connected(struct usb_hub *hub,
-- 
2.18.0
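
The effect of the change can be checked with a small user-space sketch. struct sketch_hub and sketch_power_on_good_delay() are hypothetical, reduced stand-ins for struct usb_hub and hub_power_on_good_delay(); the is_root flag replaces the !hub->hdev->parent test from the patch.

```c
#include <stdbool.h>

/* Hypothetical, reduced view of a hub: only what the delay decision needs. */
struct sketch_hub {
	unsigned char pwr_on_2_pwr_good;	/* bPwrOn2PwrGood, in 2 ms units */
	bool is_root;				/* true when the hub has no parent */
};

/* Power-on-good delay in milliseconds, following the patched logic. */
unsigned sketch_power_on_good_delay(const struct sketch_hub *hub)
{
	unsigned delay = hub->pwr_on_2_pwr_good * 2;

	/* Root hub: use the descriptor value as given. */
	if (hub->is_root)
		return delay;

	/* Other hubs: wait at least 100 ms for power to become stable. */
	return delay > 100U ? delay : 100U;
}
```

A root hub reporting bPwrOn2PwrGood = 10 now waits 20 ms instead of 100 ms, while an external hub with the same descriptor value still waits 100 ms.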



Re: [RFC PATCH] usb: core: reduce power-on-good delay time of root hub

2021-04-09 Thread Alan Stern
On Fri, Apr 09, 2021 at 10:39:07AM +0800, Chunfeng Yun wrote:
> Return exactly the delay time given by the root hub descriptor;
> this helps to reduce resume time, etc.
> 
> Because the root hub descriptor is usually provided by the host
> controller driver, a compatibility issue with one root hub can be
> fixed easily in that driver without affecting other root hubs.
> 
> Signed-off-by: Chunfeng Yun 

Acked-by: Alan Stern 

> ---
>  drivers/usb/core/hub.h | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/core/hub.h b/drivers/usb/core/hub.h
> index 73f4482d833a..22ea1f4f2d66 100644
> --- a/drivers/usb/core/hub.h
> +++ b/drivers/usb/core/hub.h
> @@ -148,8 +148,10 @@ static inline unsigned hub_power_on_good_delay(struct usb_hub *hub)
>  {
>   unsigned delay = hub->descriptor->bPwrOn2PwrGood * 2;
>  
> - /* Wait at least 100 msec for power to become stable */
> - return max(delay, 100U);
> + if (!hub->hdev->parent) /* root hub */
> + return delay;
> + else /* Wait at least 100 msec for power to become stable */
> + return max(delay, 100U);
>  }
>  
>  static inline int hub_port_debounce_be_connected(struct usb_hub *hub,
> -- 
> 2.18.0
> 


[PATCH v4 3/3] ima: enable loading of build time generated key on .ima keyring

2021-04-09 Thread Nayna Jain
The kernel currently only loads the kernel module signing key onto the
builtin trusted keyring. Load the module signing key onto the IMA keyring
as well.

Signed-off-by: Nayna Jain 
Acked-by: Stefan Berger 
---
 certs/system_certificates.S   | 13 -
 certs/system_keyring.c| 50 ---
 include/keys/system_keyring.h |  7 +
 security/integrity/digsig.c   |  2 ++
 4 files changed, 61 insertions(+), 11 deletions(-)

diff --git a/certs/system_certificates.S b/certs/system_certificates.S
index 8f29058adf93..dcad27ea8527 100644
--- a/certs/system_certificates.S
+++ b/certs/system_certificates.S
@@ -8,9 +8,11 @@
.globl system_certificate_list
 system_certificate_list:
 __cert_list_start:
-#ifdef CONFIG_MODULE_SIG
+__module_cert_start:
+#if defined(CONFIG_MODULE_SIG) || defined(CONFIG_IMA_APPRAISE_MODSIG)
.incbin "certs/signing_key.x509"
 #endif
+__module_cert_end:
.incbin "certs/x509_certificate_list"
 __cert_list_end:
 
@@ -35,3 +37,12 @@ system_certificate_list_size:
 #else
.long __cert_list_end - __cert_list_start
 #endif
+
+   .align 8
+   .globl module_cert_size
+module_cert_size:
+#ifdef CONFIG_64BIT
+   .quad __module_cert_end - __module_cert_start
+#else
+   .long __module_cert_end - __module_cert_start
+#endif
diff --git a/certs/system_keyring.c b/certs/system_keyring.c
index 4b693da488f1..2b3ad375ecc1 100644
--- a/certs/system_keyring.c
+++ b/certs/system_keyring.c
@@ -27,6 +27,7 @@ static struct key *platform_trusted_keys;
 
 extern __initconst const u8 system_certificate_list[];
 extern __initconst const unsigned long system_certificate_list_size;
+extern __initconst const unsigned long module_cert_size;
 
 /**
  * restrict_link_to_builtin_trusted - Restrict keyring addition by built in CA
@@ -132,19 +133,11 @@ static __init int system_trusted_keyring_init(void)
  */
 device_initcall(system_trusted_keyring_init);
 
-/*
- * Load the compiled-in list of X.509 certificates.
- */
-static __init int load_system_certificate_list(void)
+static __init int load_cert(const u8 *p, const u8 *end, struct key *keyring)
 {
key_ref_t key;
-   const u8 *p, *end;
size_t plen;
 
-   pr_notice("Loading compiled-in X.509 certificates\n");
-
-   p = system_certificate_list;
-   end = p + system_certificate_list_size;
while (p < end) {
/* Each cert begins with an ASN.1 SEQUENCE tag and must be more
 * than 256 bytes in size.
@@ -159,7 +152,7 @@ static __init int load_system_certificate_list(void)
if (plen > end - p)
goto dodgy_cert;
 
-   key = key_create_or_update(make_key_ref(builtin_trusted_keys, 
1),
+   key = key_create_or_update(make_key_ref(keyring, 1),
   "asymmetric",
   NULL,
   p,
@@ -186,6 +179,43 @@ static __init int load_system_certificate_list(void)
pr_err("Problem parsing in-kernel X.509 certificate list\n");
return 0;
 }
+
+__init int load_module_cert(struct key *keyring)
+{
+   const u8 *p, *end;
+
+   if (!IS_ENABLED(CONFIG_IMA_APPRAISE_MODSIG))
+   return 0;
+
+   pr_notice("Loading compiled-in module X.509 certificates\n");
+
+   p = system_certificate_list;
+   end = p + module_cert_size;
+
+   return load_cert(p, end, keyring);
+}
+
+/*
+ * Load the compiled-in list of X.509 certificates.
+ */
+static __init int load_system_certificate_list(void)
+{
+   const u8 *p, *end;
+   unsigned long size;
+
+   pr_notice("Loading compiled-in X.509 certificates\n");
+
+#ifdef CONFIG_MODULE_SIG
+   p = system_certificate_list;
+   size = system_certificate_list_size;
+#else
+   p = system_certificate_list + module_cert_size;
+   size = system_certificate_list_size - module_cert_size;
+#endif
+
+   end = p + size;
+   return load_cert(p, end, builtin_trusted_keys);
+}
 late_initcall(load_system_certificate_list);
 
 #ifdef CONFIG_SYSTEM_DATA_VERIFICATION
diff --git a/include/keys/system_keyring.h b/include/keys/system_keyring.h
index fb8b07daa9d1..f954276c616a 100644
--- a/include/keys/system_keyring.h
+++ b/include/keys/system_keyring.h
@@ -16,9 +16,16 @@ extern int restrict_link_by_builtin_trusted(struct key 
*keyring,
const struct key_type *type,
const union key_payload *payload,
struct key *restriction_key);
+extern __init int load_module_cert(struct key *keyring);
 
 #else
 #define restrict_link_by_builtin_trusted restrict_link_reject
+
+static inline __init int load_module_cert(struct key *keyring)
+{
+   return 0;
+}
+
 #endif
 
 #ifdef CONFIG_SECONDARY_TRUSTED_KEYRING
diff --git a/security/integrity/digsig.c b/security/integrity/digsig.c
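
The way the hunks above split the single compiled-in certificate blob
into two ranges can be illustrated with a small user-space sketch. The
names cert_range, module_cert_range() and builtin_cert_range() are
hypothetical and exist only for this illustration; module_sig_enabled
stands in for the CONFIG_MODULE_SIG check, and the layout assumed is the
one the patch sets up in system_certificates.S (module signing key
first, then the remaining certificates).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical half-open byte range [p, end) inside the certificate blob. */
struct cert_range {
	const unsigned char *p;
	const unsigned char *end;
};

/* Range holding only the module signing certificate (start of the blob). */
static struct cert_range module_cert_range(const unsigned char *list,
					   size_t module_size)
{
	return (struct cert_range){ list, list + module_size };
}

/*
 * Range loaded onto the builtin trusted keyring: the whole blob when
 * module signing is enabled, otherwise everything after the module cert.
 */
static struct cert_range builtin_cert_range(const unsigned char *list,
					    size_t list_size,
					    size_t module_size,
					    bool module_sig_enabled)
{
	assert(module_size <= list_size);

	if (module_sig_enabled)
		return (struct cert_range){ list, list + list_size };

	return (struct cert_range){ list + module_size, list + list_size };
}
```

The point of the sketch is that both loaders walk the same blob and only
the start and end pointers differ, which is why the patch can share a
single load_cert() helper.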

[PATCH v4 1/3] keys: cleanup build time module signing keys

2021-04-09 Thread Nayna Jain
The "mrproper" target is still looking for build time generated keys in
the kernel root directory instead of certs directory. Fix the path and
remove the names of the files which are no longer generated.

Fixes: cfc411e7fff3 ("Move certificate handling to its own directory")
Signed-off-by: Nayna Jain 
Reviewed-by: Stefan Berger 
Reviewed-by: Mimi Zohar 
Reviewed-by: Jarkko Sakkinen 
---
 Makefile | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index cc77fd45ca64..d64c94f41edb 100644
--- a/Makefile
+++ b/Makefile
@@ -1523,9 +1523,9 @@ MRPROPER_FILES += include/config include/generated
  \
  debian snap tar-install \
  .config .config.old .version \
  Module.symvers \
- signing_key.pem signing_key.priv signing_key.x509 \
- x509.genkey extra_certificates signing_key.x509.keyid \
- signing_key.x509.signer vmlinux-gdb.py \
+ certs/signing_key.pem certs/signing_key.x509 \
+ certs/x509.genkey \
+ vmlinux-gdb.py \
  *.spec
 
 # Directories & files removed with 'make distclean'
-- 
2.29.2



[PATCH v4 2/3] ima: enable signing of modules with build time generated key

2021-04-09 Thread Nayna Jain
The kernel build process currently only signs kernel modules when
MODULE_SIG is enabled. Also, sign the kernel modules at build time when
IMA_APPRAISE_MODSIG is enabled.

Signed-off-by: Nayna Jain 
Acked-by: Stefan Berger 
---
 certs/Kconfig  | 2 +-
 certs/Makefile | 8 
 init/Kconfig   | 6 +++---
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/certs/Kconfig b/certs/Kconfig
index c94e93d8bccf..48675ad319db 100644
--- a/certs/Kconfig
+++ b/certs/Kconfig
@@ -4,7 +4,7 @@ menu "Certificates for signature checking"
 config MODULE_SIG_KEY
string "File name or PKCS#11 URI of module signing key"
default "certs/signing_key.pem"
-   depends on MODULE_SIG
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
help
  Provide the file name of a private key/certificate in PEM format,
  or a PKCS#11 URI according to RFC7512. The file should contain, or
diff --git a/certs/Makefile b/certs/Makefile
index f4c25b67aad9..e3185c57fbd8 100644
--- a/certs/Makefile
+++ b/certs/Makefile
@@ -32,6 +32,14 @@ endif # CONFIG_SYSTEM_TRUSTED_KEYRING
 clean-files := x509_certificate_list .x509.list
 
 ifeq ($(CONFIG_MODULE_SIG),y)
+   SIGN_KEY = y
+endif
+
+ifeq ($(CONFIG_IMA_APPRAISE_MODSIG),y)
+   SIGN_KEY = y
+endif
+
+ifdef SIGN_KEY
 ###
 #
 # If module signing is requested, say by allyesconfig, but a key has not been
diff --git a/init/Kconfig b/init/Kconfig
index 5f5c776ef192..85e48a578f90 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2164,7 +2164,7 @@ config MODULE_SIG_FORCE
 config MODULE_SIG_ALL
bool "Automatically sign all modules"
default y
-   depends on MODULE_SIG
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
help
  Sign all modules during make modules_install. Without this option,
  modules must be signed manually, using the scripts/sign-file tool.
@@ -2174,7 +2174,7 @@ comment "Do not forget to sign required modules with 
scripts/sign-file"
 
 choice
prompt "Which hash algorithm should modules be signed with?"
-   depends on MODULE_SIG
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
help
  This determines which sort of hashing algorithm will be used during
  signature generation.  This algorithm _must_ be built into the kernel
@@ -2206,7 +2206,7 @@ endchoice
 
 config MODULE_SIG_HASH
string
-   depends on MODULE_SIG
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
default "sha1" if MODULE_SIG_SHA1
default "sha224" if MODULE_SIG_SHA224
default "sha256" if MODULE_SIG_SHA256
-- 
2.29.2



Re: [PATCH 3/3] media: venus: don't de-reference NULL pointers at IRQ time

2021-04-09 Thread Stanimir Varbanov
Hi Mauro,

On 4/8/21 10:40 AM, Mauro Carvalho Chehab wrote:
> Smatch is warning that:
>   drivers/media/platform/qcom/venus/hfi_venus.c:1100 venus_isr() warn: 
> variable dereferenced before check 'hdev' (see line 1097)
> 
> The logic basically does:
>   hdev = to_hfi_priv(core);
> 
> which is translated to:
>   hdev = core->priv;
> 
> If the IRQ code can receive a NULL pointer for hdev, there's
> a bug there, as it will first try to de-reference the pointer,
> and then check if it is null.
> 
> After looking at the code, it seems that this indeed can happen:
> Basically, the venus IRQ thread is started with:
>   devm_request_threaded_irq()
> So, it will only be freed after the driver unbinds.
> 
> In order to prevent the IRQ code from working with freed data,
> the logic at venus_hfi_destroy() sets core->priv to NULL,
> which would make the IRQ code ignore any pending IRQs.
> 
> There is, however, a race condition, as core->priv is set
> to NULL only after being freed. So, we also need to move the
> core->priv = NULL assignment to happen earlier.
> 
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>  drivers/media/platform/qcom/venus/hfi_venus.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)

Acked-by: Stanimir Varbanov 

> 
> diff --git a/drivers/media/platform/qcom/venus/hfi_venus.c 
> b/drivers/media/platform/qcom/venus/hfi_venus.c
> index cebb20cf371f..ce98c523b3c6 100644
> --- a/drivers/media/platform/qcom/venus/hfi_venus.c
> +++ b/drivers/media/platform/qcom/venus/hfi_venus.c
> @@ -1094,12 +1094,15 @@ static irqreturn_t venus_isr(struct venus_core *core)
>  {
>   struct venus_hfi_device *hdev = to_hfi_priv(core);
>   u32 status;
> - void __iomem *cpu_cs_base = hdev->core->cpu_cs_base;
> - void __iomem *wrapper_base = hdev->core->wrapper_base;
> + void __iomem *cpu_cs_base;
> + void __iomem *wrapper_base;
>  
>   if (!hdev)
>   return IRQ_NONE;
>  
> + cpu_cs_base = hdev->core->cpu_cs_base;
> + wrapper_base = hdev->core->wrapper_base;
> +
>   status = readl(wrapper_base + WRAPPER_INTR_STATUS);
>   if (IS_V6(core)) {
>   if (status & WRAPPER_INTR_STATUS_A2H_MASK ||
> @@ -1650,10 +1653,10 @@ void venus_hfi_destroy(struct venus_core *core)
>  {
>   struct venus_hfi_device *hdev = to_hfi_priv(core);
>  
> + core->priv = NULL;
>   venus_interface_queues_release(hdev);
>   mutex_destroy(&hdev->lock);
>   kfree(hdev);
> - core->priv = NULL;
>   core->ops = NULL;
>  }
>  
> 

-- 
regards,
Stan
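
The ordering described in the quoted commit message (check the pointer
before using it, and clear it before freeing) can be sketched in plain C
as below. All demo_* names are hypothetical and not part of the venus
driver. The sketch is single-threaded and only shows the order of
operations; it does not model the synchronization a real IRQ handler and
teardown path would additionally need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for venus_hfi_device and venus_core. */
struct demo_hdev {
	int irq_count;
};

struct demo_core {
	struct demo_hdev *priv;
};

enum demo_irqreturn { DEMO_IRQ_NONE, DEMO_IRQ_HANDLED };

/* Check the private pointer first, dereference it only afterwards. */
static enum demo_irqreturn demo_isr(struct demo_core *core)
{
	struct demo_hdev *hdev = core->priv;

	if (!hdev)
		return DEMO_IRQ_NONE;

	hdev->irq_count++;
	return DEMO_IRQ_HANDLED;
}

/* Unpublish the pointer before releasing the memory it points to. */
static void demo_destroy(struct demo_core *core)
{
	struct demo_hdev *hdev = core->priv;

	core->priv = NULL;
	free(hdev);
}
```

After demo_destroy() returns, a late call to demo_isr() sees NULL and
returns DEMO_IRQ_NONE instead of touching freed memory.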


[RFC PATCH] usb: core: reduce power-on-good delay time of root hub

2021-04-08 Thread Chunfeng Yun
Return the exact delay time given by the root hub descriptor;
this helps to reduce resume time etc.

Because the root hub descriptor is usually provided by the host
controller driver, if there is a compatibility issue with a root hub,
we can fix it easily without affecting other root hubs.

Signed-off-by: Chunfeng Yun 
---
 drivers/usb/core/hub.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/core/hub.h b/drivers/usb/core/hub.h
index 73f4482d833a..22ea1f4f2d66 100644
--- a/drivers/usb/core/hub.h
+++ b/drivers/usb/core/hub.h
@@ -148,8 +148,10 @@ static inline unsigned hub_power_on_good_delay(struct 
usb_hub *hub)
 {
unsigned delay = hub->descriptor->bPwrOn2PwrGood * 2;
 
-   /* Wait at least 100 msec for power to become stable */
-   return max(delay, 100U);
+   if (!hub->hdev->parent) /* root hub */
+   return delay;
+   else /* Wait at least 100 msec for power to become stable */
+   return max(delay, 100U);
 }
 
 static inline int hub_port_debounce_be_connected(struct usb_hub *hub,
-- 
2.18.0



Re: [RFC PATCH v6 1/1] cmdline: Add capability to both append and prepend at the same time

2021-04-08 Thread Rob Herring
On Sun, Apr 4, 2021 at 12:20 PM Christophe Leroy
 wrote:
>
> One user has expressed the need to both append and prepend some
> built-in parameters to the command line provided by the bootloader.
>
> Although it is a corner case, it is easy to implement so let's do it.
>
> When the user chooses to prepend the bootloader provided command line
> with the built-in command line, he is offered the possibility to enter
> an additional built-in command line to be appended after the
> bootloader provided command line.
>
> It is a complementary feature which has no impact on the already
> existing ones and/or the existing defconfig.
>
> Suggested-by: Daniel Walker 
> Signed-off-by: Christophe Leroy 
> ---
> Sending this out as an RFC, applies on top of the series
> ("Implement GENERIC_CMDLINE"). I will add it to the series next spin
> unless someone is against it.

Well, it works, but you are working around the existing kconfig and
the result is not great. You'd never design it this way.

Rob


[PATCH 3/3] media: venus: don't de-reference NULL pointers at IRQ time

2021-04-08 Thread Mauro Carvalho Chehab
Smatch is warning that:
drivers/media/platform/qcom/venus/hfi_venus.c:1100 venus_isr() warn: 
variable dereferenced before check 'hdev' (see line 1097)

The logic basically does:
hdev = to_hfi_priv(core);

which is translated to:
hdev = core->priv;

If the IRQ code can receive a NULL pointer for hdev, there's
a bug there, as it will first try to de-reference the pointer,
and then check if it is null.

After looking at the code, it seems that this indeed can happen:
Basically, the venus IRQ thread is started with:
devm_request_threaded_irq()
So, it will only be freed after the driver unbinds.

In order to prevent the IRQ code from working with freed data,
the logic at venus_hfi_destroy() sets core->priv to NULL,
which would make the IRQ code ignore any pending IRQs.

There is, however, a race condition, as core->priv is set
to NULL only after being freed. So, we also need to move the
core->priv = NULL assignment to happen earlier.

Signed-off-by: Mauro Carvalho Chehab 
---
 drivers/media/platform/qcom/venus/hfi_venus.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/media/platform/qcom/venus/hfi_venus.c 
b/drivers/media/platform/qcom/venus/hfi_venus.c
index cebb20cf371f..ce98c523b3c6 100644
--- a/drivers/media/platform/qcom/venus/hfi_venus.c
+++ b/drivers/media/platform/qcom/venus/hfi_venus.c
@@ -1094,12 +1094,15 @@ static irqreturn_t venus_isr(struct venus_core *core)
 {
struct venus_hfi_device *hdev = to_hfi_priv(core);
u32 status;
-   void __iomem *cpu_cs_base = hdev->core->cpu_cs_base;
-   void __iomem *wrapper_base = hdev->core->wrapper_base;
+   void __iomem *cpu_cs_base;
+   void __iomem *wrapper_base;
 
if (!hdev)
return IRQ_NONE;
 
+   cpu_cs_base = hdev->core->cpu_cs_base;
+   wrapper_base = hdev->core->wrapper_base;
+
status = readl(wrapper_base + WRAPPER_INTR_STATUS);
if (IS_V6(core)) {
if (status & WRAPPER_INTR_STATUS_A2H_MASK ||
@@ -1650,10 +1653,10 @@ void venus_hfi_destroy(struct venus_core *core)
 {
struct venus_hfi_device *hdev = to_hfi_priv(core);
 
+   core->priv = NULL;
venus_interface_queues_release(hdev);
mutex_destroy(&hdev->lock);
kfree(hdev);
-   core->priv = NULL;
core->ops = NULL;
 }
 
-- 
2.30.2



Re: [PATCH] mtd: add OTP (one-time-programmable) erase ioctl

2021-04-08 Thread Miquel Raynal
Hello,

Michael Walle  wrote on Thu, 08 Apr 2021 08:55:42
+0200:

> Hi Tudor,
> 
> On 2021-04-08 07:51, tudor.amba...@microchip.com wrote:
> > Would you please resend this patch, together with the mtd-utils
> > and the SPI NOR patch in a single patch set? You'll help us all
> > having all in a single place.  
> 
> This has already been picked-up:
> https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/commit/?h=mtd/next=e3c1f1c92d6ede3cfa09d6a103d3d1c1ef645e35
> 
> Although, I didn't receive an email notice.
> 
> -michael

Sometimes the notifications are not triggered when there is a conflict
when applying the patch from patchwork directly. I usually answer
manually in this case but I might have forgotten.

About the patch, I felt it was good enough for merging, and I want to
avoid applying such patches right before freezing our branches. Hence,
I tend to be more aggressive earlier in the release cycle because I
hate it when my patches get delayed indefinitely. The flip side is a
more careful approach once -rc6 gets tagged, so that I can drop anything
that turns out to be badly broken before our -next branches are frozen,
which would otherwise lead to a useless public revert. Of course, I am
fully open to removing this patch from -next if you feel it was merged
too early, and will happily drop it for this release: we can move the
patch to the next release if you agree (especially since it touches the
ABI).

Cheers,
Miquèl


Re: [PATCH] mtd: add OTP (one-time-programmable) erase ioctl

2021-04-08 Thread Michael Walle

Hi Tudor,

On 2021-04-08 07:51, tudor.amba...@microchip.com wrote:

> Would you please resend this patch, together with the mtd-utils
> and the SPI NOR patch in a single patch set? You'll help us all
> having all in a single place.


This has already been picked-up:
https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git/commit/?h=mtd/next=e3c1f1c92d6ede3cfa09d6a103d3d1c1ef645e35

Although, I didn't receive an email notice.

-michael


Re: [PATCH] mtd: add OTP (one-time-programmable) erase ioctl

2021-04-07 Thread Tudor.Ambarus
Michael,

Would you please resend this patch, together with the mtd-utils
and the SPI NOR patch in a single patch set? You'll help us all
having all in a single place.

For the new ioctl we'll need acks from all the mtd maintainers
and at least a tested-by tag.

Cheers,
ta


[PATCH v13 07/18] arm64: kexec: flush image and lists during kexec load time

2021-04-07 Thread Pavel Tatashin
Currently, during kexec load we are copying relocation function and
flushing it. However, we can also flush kexec relocation buffers and
if new kernel image is already in place (i.e. crash kernel), we can
also flush the new kernel image itself.

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/machine_kexec.c | 49 +++
 1 file changed, 23 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index 90a335c74442..3a034bc25709 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -59,23 +59,6 @@ void machine_kexec_cleanup(struct kimage *kimage)
/* Empty routine needed to avoid build errors. */
 }
 
-int machine_kexec_post_load(struct kimage *kimage)
-{
-   void *reloc_code = page_to_virt(kimage->control_code_page);
-
-   memcpy(reloc_code, arm64_relocate_new_kernel,
-  arm64_relocate_new_kernel_size);
-   kimage->arch.kern_reloc = __pa(reloc_code);
-   kexec_image_info(kimage);
-
-   /* Flush the reloc_code in preparation for its execution. */
-   __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
-   flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
-  arm64_relocate_new_kernel_size);
-
-   return 0;
-}
-
 /**
  * machine_kexec_prepare - Prepare for a kexec reboot.
  *
@@ -152,6 +135,29 @@ static void kexec_segment_flush(const struct kimage 
*kimage)
}
 }
 
+int machine_kexec_post_load(struct kimage *kimage)
+{
+   void *reloc_code = page_to_virt(kimage->control_code_page);
+
+   /* If in place flush new kernel image, else flush lists and buffers */
+   if (kimage->head & IND_DONE)
+   kexec_segment_flush(kimage);
+   else
+   kexec_list_flush(kimage);
+
+   memcpy(reloc_code, arm64_relocate_new_kernel,
+  arm64_relocate_new_kernel_size);
+   kimage->arch.kern_reloc = __pa(reloc_code);
+   kexec_image_info(kimage);
+
+   /* Flush the reloc_code in preparation for its execution. */
+   __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
+   flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
+  arm64_relocate_new_kernel_size);
+
+   return 0;
+}
+
 /**
  * machine_kexec - Do the kexec reboot.
  *
@@ -169,13 +175,6 @@ void machine_kexec(struct kimage *kimage)
WARN(in_kexec_crash && (stuck_cpus || smp_crash_stop_failed()),
"Some CPUs may be stale, kdump will be unreliable.\n");
 
-   /* Flush the kimage list and its buffers. */
-   kexec_list_flush(kimage);
-
-   /* Flush the new image if already in place. */
-   if ((kimage != kexec_crash_image) && (kimage->head & IND_DONE))
-   kexec_segment_flush(kimage);
-
pr_info("Bye!\n");
 
local_daif_mask();
@@ -250,8 +249,6 @@ void arch_kexec_protect_crashkres(void)
 {
int i;
 
-   kexec_segment_flush(kexec_crash_image);
-
for (i = 0; i < kexec_crash_image->nr_segments; i++)
set_memory_valid(
__phys_to_virt(kexec_crash_image->segment[i].mem),
-- 
2.25.1



Re: [PATCH] kernel/time: Feedback reply for hr_sleep syscall, a fine-grained sleep service

2021-04-07 Thread Thomas Gleixner
Marco!

On Wed, Apr 07 2021 at 11:32, Marco Faltelli wrote:

> Current sleep services (nanosleep) provide sleep periods very far from
> the expectations when scheduling microsecond-scale timers. On our
> testbed, using rdtscp() before and after a nanosleep() syscall to
> measure the effective elapsed time with a 1us timer, we got ~59us.
> Even with larger timeout periods, the difference is still evident
> (e.g., with a 100us timer, we measured ~158us of elapsed time).

So the delta is a constant of ~50us, right?

> We believe that one of the reasons is the use of the timespec
> structure, which needs to be copied from user to kernel and then
> converted into a single-value representation.

Interesting.

> In our work Metronome
> (https://dl.acm.org/doi/pdf/10.1145/3386367.3432730) we had the need
> for a precise microsecond-granularity sleep service, as nanosleep()
> was far from our needs, so we developed hr_sleep(), a new sleep
> service.

The above 'interesting' assumption made me curious about the deeper
insight, so I went to read. Let me give you a few comments on that
paper.

> In current conventional implementations of the Linux kernel, the support
> for (fine-grain) sleep periods of threads is based on the nanosleep()
> system call, which has index 35 in the current specification of the
> x86-64/Linux system call table.

There is also clock_nanosleep(2) for completeness' sake...

> The actual execution path of this system call, particularly at kernel
> side, is shown in Figure 1a. When entering kernel mode the thread exploits 
> two main kernel
> level subsystems. One is the scheduling subsystem, which allows managing
> the run-queue of threads that can be rescheduled in CPU. The other one
> is the high-resolution timers subsystem, which allows posting
> timer-expiration requests to the Linux kernel timer wheel.

The timer wheel is not involved in this at all. If your timer would end
up on the timer wheel your observed latencies would be in the
milliseconds range not in the 50usec range.

> The latter is a data structure that keeps the ordered set of timer
> expiration requests, so that each time one of these timers expires the
> subsequent timer expiration request is activated.

Not exactly how the timer wheel in the kernel works, but that's
irrelevant because it is not involved in this.

> The expiration of a timer is associated with the interrupt coming from
> the High Precision Event Timer (HPET) on board of x86 processors.

You must have a really old and crappy machine to test on. HPET is
avoided on any CPU less than 10 years old and pretty much irrelevant or
a performance horror on any machine which has more than 4 cores.

> In any case, independently of whether preemption will occur, the CPU
> cycles spent for that preamble lead to a delay for the post of the
> timer- expiration request to the timer wheel, leading the thread to
> actually start its timed-sleep phase with a delay.

I would have expected a proper measurement of the delay which is caused
by that processing in the paper, but in absence of that I instrumented
it for you:

First of all I implemented the thing myself, because the crud you posted
fails to compile (see below) and for other reasons which I spare myself
to explain because of that.

The regular clock_nanosleep() over the hacked up nanosleep_u64(), which
just takes a u64 nanosecond value as argument instead of the timespec
pointer has an overhead of ~64 CPU clock cycles on average according to
'perf stat' which amounts to a whopping 32 nanoseconds per syscall on my
test machine running at 2 GHz.

That's _three_ orders of magnitude off from 50us. There goes the theory.
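
The arithmetic behind that comparison can be written down as a small
helper. This is an illustrative sketch only: cycles_to_ns() is a
hypothetical name, and the inputs are the figures quoted above (about 64
cycles of overhead at 2 GHz, against an observed delta of about 50us).

```c
#include <assert.h>
#include <stdint.h>

/* Convert a CPU cycle count to nanoseconds for a clock frequency in Hz. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t cpu_hz)
{
	assert(cpu_hz != 0);
	return cycles * UINT64_C(1000000000) / cpu_hz;
}
```

cycles_to_ns(64, 2000000000) evaluates to 32, and 50000 ns divided by
32 ns is roughly 1500, i.e. about three orders of magnitude.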

So now where are these 50 microseconds coming from?

There is no massive software/hardware induced overhead caused by the
timespec pointer handling at all, the 50 microseconds are simply the
default timer slack value which is added to the expiry value to allow
better batching of timer expiries. 

That slack is automatically zero for tasks in real time scheduling
classes and can also be modified by a system wide setting and per
process via prctl(PR_SET_TIMERSLACK, .) unless a system policy
prevents that.

That prctl unfortunately has a severe limitation: it does not allow
setting the slack value to 0, the minimum value is 1 nanosecond, and I'm
happy to discuss that when you come up with a proper scientific proof
that that _one_ nanosecond matters.
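
As an illustration of the prctl() usage mentioned above, here is a
minimal user-space sketch that lowers the timer slack to 1 nanosecond
and measures a short clock_nanosleep(). The helper names
set_timer_slack_ns(), get_timer_slack_ns() and measure_sleep_ns() are
hypothetical; the sketch assumes a Linux system on which the prctl is
not blocked by policy, and it uses CLOCK_MONOTONIC rather than rdtscp()
for the measurement.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/prctl.h>
#include <time.h>

/* Set the calling thread's timer slack in nanoseconds; 0 on success. */
static int set_timer_slack_ns(unsigned long slack_ns)
{
	return prctl(PR_SET_TIMERSLACK, slack_ns, 0, 0, 0);
}

/* Read back the calling thread's current timer slack in nanoseconds. */
static long get_timer_slack_ns(void)
{
	return prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
}

/*
 * Sleep for request_ns (less than one second) with a relative
 * clock_nanosleep() and return the elapsed CLOCK_MONOTONIC time in ns.
 */
static int64_t measure_sleep_ns(long request_ns)
{
	struct timespec req = { .tv_sec = 0, .tv_nsec = request_ns };
	struct timespec t0, t1;

	assert(request_ns >= 0 && request_ns < 1000000000L);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	clock_nanosleep(CLOCK_MONOTONIC, 0, &req, NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL +
	       (t1.tv_nsec - t0.tv_nsec);
}
```

Calling measure_sleep_ns(1000) once before and once after
set_timer_slack_ns(1) gives a quick way to check on a given machine how
much of the observed delta is attributable to the slack.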

As a limited excuse I concede, that the timer slack is barely
documented, but I'm thoroughly surprised that this has not been figured
out and instead of that weird theories about the syscall entry code
implications make up several pages of handwaving content of a published
and 'reviewed' academic paper.

So here is the comparison between the regular clock_nanosleep() with the
prctl() used and the u64 based variant which sets the

Re: [RESEND] i2c: mediatek: Get device clock-stretch time via dts

2021-04-07 Thread Wolfram Sang

> Due to clock stretching, our HW IP cannot meet the AC timing
> spec (tSU;STA, tSU;STO).
> The clock-stretching delay is not constant, so we need to pass a
> parameter, which can be found through measurement, to meet most
> conditions.

What about using this existing binding?

- i2c-scl-internal-delay-ns
Number of nanoseconds the IP core additionally needs to setup SCL.





Re: [PATCH] platform/surface: aggregator_registry: Give devices time to set up when connecting

2021-04-07 Thread Hans de Goede
Hi,

On 4/6/21 1:12 AM, Maximilian Luz wrote:
> Sometimes, the "base connected" event that we rely on to (re-)attach the
> device connected to the base is sent a bit too early. When this happens,
> some devices may not be completely ready yet.
> 
> Specifically, the battery has been observed to report zero-values for
> things like full charge capacity, which, however, is only loaded once
> when the driver for that device probes. This can thus result in battery
> readings being unavailable.
> 
> As we cannot easily and reliably discern between devices that are not
> ready yet and devices that are not connected (i.e. will never be ready),
> delay adding these devices. This should give them enough time to set up.
> 
> The delay is set to 2.5 seconds, which should give us a good safety
> margin based on testing and still be fairly responsive for users.
> 
> To achieve that delay switch to updating via a delayed work struct,
> which means that we can also get rid of some locking.
> 
> Signed-off-by: Maximilian Luz 

Thank you for your patch, I've applied this patch to my review-hans 
branch:
https://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git/log/?h=review-hans

Note it will show up in my review-hans branch once I've pushed my
local branch there, which might take a while.

Once I've run some tests on this branch the patches there will be
added to the platform-drivers-x86/for-next branch and eventually
will be included in the pdx86 pull-request to Linus for the next
merge-window.

Regards,

Hans


> ---
>  .../surface/surface_aggregator_registry.c | 98 ---
>  1 file changed, 40 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/platform/surface/surface_aggregator_registry.c 
> b/drivers/platform/surface/surface_aggregator_registry.c
> index eccb9d1007cd..685d37a7add1 100644
> --- a/drivers/platform/surface/surface_aggregator_registry.c
> +++ b/drivers/platform/surface/surface_aggregator_registry.c
> @@ -13,10 +13,10 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -287,6 +287,13 @@ static int ssam_hub_add_devices(struct device *parent, 
> struct ssam_controller *c
>  
>  /* -- SSAM base-hub driver. 
> - */
>  
> +/*
> + * Some devices (especially battery) may need a bit of time to be fully 
> usable
> + * after being (re-)connected. This delay has been determined via
> + * experimentation.
> + */
> +#define SSAM_BASE_UPDATE_CONNECT_DELAY   msecs_to_jiffies(2500)
> +
>  enum ssam_base_hub_state {
>   SSAM_BASE_HUB_UNINITIALIZED,
>   SSAM_BASE_HUB_CONNECTED,
> @@ -296,8 +303,8 @@ enum ssam_base_hub_state {
>  struct ssam_base_hub {
>   struct ssam_device *sdev;
>  
> - struct mutex lock;  /* Guards state update checks and transitions. */
>   enum ssam_base_hub_state state;
> + struct delayed_work update_work;
>  
>   struct ssam_event_notifier notif;
>  };
> @@ -335,11 +342,7 @@ static ssize_t ssam_base_hub_state_show(struct device 
> *dev, struct device_attrib
>   char *buf)
>  {
>   struct ssam_base_hub *hub = dev_get_drvdata(dev);
> - bool connected;
> -
> - mutex_lock(&hub->lock);
> - connected = hub->state == SSAM_BASE_HUB_CONNECTED;
> - mutex_unlock(&hub->lock);
> + bool connected = hub->state == SSAM_BASE_HUB_CONNECTED;
>  
>   return sysfs_emit(buf, "%d\n", connected);
>  }
> @@ -356,16 +359,20 @@ static const struct attribute_group ssam_base_hub_group 
> = {
>   .attrs = ssam_base_hub_attrs,
>  };
>  
> -static int __ssam_base_hub_update(struct ssam_base_hub *hub, enum 
> ssam_base_hub_state new)
> +static void ssam_base_hub_update_workfn(struct work_struct *work)
>  {
> + struct ssam_base_hub *hub = container_of(work, struct ssam_base_hub, 
> update_work.work);
>   struct fwnode_handle *node = dev_fwnode(&hub->sdev->dev);
> + enum ssam_base_hub_state state;
>   int status = 0;
>  
> - lockdep_assert_held(&hub->lock);
> + status = ssam_base_hub_query_state(hub, &state);
> + if (status)
> + return;
>  
> - if (hub->state == new)
> - return 0;
> - hub->state = new;
> + if (hub->state == state)
> + return;
> + hub->state = state;
>  
>   if (hub->state == SSAM_BASE_HUB_CONNECTED)
>   status = ssam_hub_add_devices(&hub->sdev->dev, hub->sdev->ctrl, 
> node);
> @@ -374,51 +381,28 @@ static int __ssam_base_hub_update(struct ssam_base_hub 
>

Re: [RESEND] i2c: mediatek: Get device clock-stretch time via dts

2021-04-07 Thread Qii Wang
On Tue, 2021-04-06 at 21:48 +0200, Wolfram Sang wrote:
> On Sat, Mar 13, 2021 at 04:04:24PM +0800, qii.w...@mediatek.com wrote:
> > From: Qii Wang 
> > 
> > tSU,STA/tHD,STA/tSU,STOP may be out of spec due to device
> > clock-stretching or circuit loss, we could get device
> > clock-stretch time from dts to adjust these parameters
> > to meet the spec via EXT_CONF register.
> > 
> > Signed-off-by: Qii Wang 
> 
> I tried to understand from the code what the new binding expresses, but
> I don't fully understand it. Is it the maximum clock stretch time?
> Because I cannot recall a device which always uses the same delay for
> clock stretching.
> 

Due to clock stretching, our HW IP cannot meet the AC timing
spec (tSU;STA, tSU;STO).
The clock-stretching delay is not constant, so we need to pass a
parameter, which can be found through measurement, to meet most
conditions.



[PATCH] kernel/time: Feedback reply for hr_sleep syscall, a fine-grained sleep service

2021-04-07 Thread Marco Faltelli
Current sleep services (nanosleep) provide sleep periods very far from the 
expectations when scheuling microsecond-scale timers. On our testbed, using 
rdtscp() before and after a nanosleep() syscall to measure the effective 
elapsed time with a 1us timer, we got ~59us.
Even with larger timeout periods, the difference is still evident (e.g., with a 
100us timer, we measured ~158us of elapsed time).
We believe that one of the reasons is the use of the timespec structure, that 
needs to be copied for user to kernel and then converted into a single-value 
representation.
In our work Metronome (https://dl.acm.org/doi/pdf/10.1145/3386367.3432730) we 
had the need for a precise microsecond-granularity sleep service, as 
nanosleep() was far from our needs, so we developed hr_sleep(), a new sleep 
service. Since the sleep periods needed in our case are small, we don't want 
our sleep service to re-schedule a timer in case of a signal interruption, so 
it just returns -EINTR to the user. The user must be aware that this is a 
best-effort sleep service, so the sleep period specified is an upper-bound of 
the effective elapsed time.
We believe this patch can be useful in applications where fine-grained 
granularity is requested for small sleep periods, and re-scheduling the timer 
in case of a signal is not mandatory.
In the paper previously linked, Section 3.1 provides more details about 
hr_sleep and Section 3.3 extensively evaluates hr_sleep() and compares it to 
nanosleep(). For a 1us timeout, hr_sleep() elapses ~3.8us in mean vs. the ~59us 
of nanosleep().
hr_sleep has been previously submitted at 
https://lore.kernel.org/lkml/20210115180733.5663-1-marco.falte...@uniroma2.it/.
This commit addresses the previous feedback in
https://lore.kernel.org/lkml/CALCETrWfnL=3m3nmmhs-a3si5jptsctf6cethvtsdnwa5mh...@mail.gmail.com/
and applies the requested changes.

Signed-off-by: Marco Faltelli 
---
 arch/x86/entry/syscalls/syscall_64.tbl |  1 +
 kernel/time/hrtimer.c  | 67 ++
 2 files changed, 68 insertions(+)

diff --git a/arch/x86/entry/syscalls/syscall_64.tbl 
b/arch/x86/entry/syscalls/syscall_64.tbl
index 7bf01cbe582f..85b14dfa40fb 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -364,6 +364,7 @@
 440  common  process_madvise  sys_process_madvise
 441  common  epoll_pwait2     sys_epoll_pwait2
 442  common  mount_setattr    sys_mount_setattr
+443  common  hr_sleep         sys_hr_sleep
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 4a66725b1d4a..887c01392e08 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -2006,6 +2006,73 @@ SYSCALL_DEFINE2(nanosleep_time32, struct old_timespec32 
__user *, rqtp,
 }
 #endif
 
+
+
+#ifdef CONFIG_64BIT
+
+
+struct control_record {
+   struct task_struct *task;
+   int awake;
+   struct hrtimer hr_timer;
+};
+
+
+static enum hrtimer_restart hr_sleep_callback(struct hrtimer *timer)
+{
+   struct control_record *control;
+   struct task_struct *the_task;
+
+   control = container_of(timer, struct control_record, hr_timer);
+   control->awake = 1;
+   the_task = control->task;
+   wake_up_process(the_task);
+
+   return HRTIMER_NORESTART;
+}
+
+
+
+/**
+ * hr_sleep - a high-resolution sleep service for fine-grained timeouts
+ * @nanoseconds:   the requested sleep period in nanoseconds
+ *
+ * Returns:
+ * 0 when the sleep request successfully terminated
+ * -EINVAL if a sleep period < 0 is requested
+ * -EINTR if a signal interrupted the calling thread
+ */
+SYSCALL_DEFINE1(hr_sleep, long, nanoseconds)
+{
+   DECLARE_WAIT_QUEUE_HEAD(the_queue);
+   struct control_record control;
+   ktime_t ktime_interval;
+   struct restart_block *restart;
+
+   if (nanoseconds < 0)
+   return -EINVAL;
+
+   if (nanoseconds == 0)
+   return 0;
+
+   ktime_interval = ktime_set(0, nanoseconds);
+   hrtimer_init(&(control.hr_timer), CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+   control.hr_timer.function = &hr_sleep_callback;
+   control.task = current;
+   control.awake = 0;
+   hrtimer_start(&(control.hr_timer), ktime_interval, HRTIMER_MODE_REL);
+   wait_event_interruptible(the_queue, control.awake == 1);
+   hrtimer_cancel(&(control.hr_timer));
+   if (control.awake == 0)
+   //We have been interrupted by a signal
+   return -EINTR;
+   return 0;
+
+}
+
+#endif
+
+
 /*
  * Functions related to boot-time initialization:
  */
-- 
2.25.1
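
For readers who want to reproduce the kind of measurement described in the
commit message, here is a minimal user-space sketch. It is not part of the
patch: it uses clock_gettime(CLOCK_MONOTONIC) rather than rdtscp(), and the
helper name measure_nanosleep_ns() is made up for illustration. The numbers
it reports will depend on the machine, the timer slack and the kernel config.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper (not from the patch): request a relative sleep of
 * req_ns nanoseconds via nanosleep() and return the elapsed time in
 * nanoseconds as seen by CLOCK_MONOTONIC. Returns -1 on a negative
 * request, on a clock error, or if the sleep was interrupted.
 */
int64_t measure_nanosleep_ns(long req_ns)
{
	struct timespec req, t0, t1;

	if (req_ns < 0)
		return -1;

	req.tv_sec = req_ns / 1000000000L;
	req.tv_nsec = req_ns % 1000000000L;

	if (clock_gettime(CLOCK_MONOTONIC, &t0))
		return -1;
	if (nanosleep(&req, NULL))
		return -1;
	if (clock_gettime(CLOCK_MONOTONIC, &t1))
		return -1;

	return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL +
	       (t1.tv_nsec - t0.tv_nsec);
}
```

Calling measure_nanosleep_ns(1000) in a loop and averaging the results gives
a rough equivalent of the "1us timer" experiment quoted above.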



Re: [PATCH] KVM: X86: Properly account for guest CPU time when considering context tracking

2021-04-06 Thread Sean Christopherson
On Tue, Mar 30, 2021, Wanpeng Li wrote:
> On Tue, 30 Mar 2021 at 01:15, Sean Christopherson  wrote:
> >
> > +Thomas
> >
> > On Mon, Mar 29, 2021, Wanpeng Li wrote:
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index 32cf828..85695b3 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -6689,7 +6689,8 @@ static noinstr void vmx_vcpu_enter_exit(struct 
> > > kvm_vcpu *vcpu,
> > >* into world and some more.
> > >*/
> > >   lockdep_hardirqs_off(CALLER_ADDR0);
> > > - guest_exit_irqoff();
> > > + if (vtime_accounting_enabled_this_cpu())
> > > + guest_exit_irqoff();
> >
> > This looks ok, as CONFIG_CONTEXT_TRACKING and 
> > CONFIG_VIRT_CPU_ACCOUNTING_GEN are
> > selected by CONFIG_NO_HZ_FULL=y, and can't be enabled independently, e.g. 
> > the
> > rcu_user_exit() call won't be delayed because it will never be called in the
> > !vtime case.  But it still feels wrong poking into those details, e.g. it'll
> > be weird and/or wrong guest_exit_irqoff() gains stuff that isn't vtime 
> > specific.
> 
> Could you elaborate what's the meaning of "it'll be weird and/or wrong
> guest_exit_irqoff() gains stuff that isn't vtime specific."?

For example, if RCU logic is added to guest_exit_irqoff() that is needed
irrespective of vtime, then KVM will end up with different RCU logic depending
on whether or not vtime is enabled.  RCU is just an example.  My point is that
it doesn't seem impossible that there would be something in the future that
wants to tap into the guest->host transition.

Maybe that never happens and the vtime check is perfectly ok, but for me, the
name guest_exit_irqoff() doesn't sound like something that should hinge on
time accounting being enabled.


Re: [RESEND] i2c: mediatek: Get device clock-stretch time via dts

2021-04-06 Thread Wolfram Sang
On Sat, Mar 13, 2021 at 04:04:24PM +0800, qii.w...@mediatek.com wrote:
> From: Qii Wang 
> 
> tSU,STA/tHD,STA/tSU,STOP may be out of spec due to device
> clock-stretching or circuit loss, we could get device
> clock-stretch time from dts to adjust these parameters
> to meet the spec via EXT_CONF register.
> 
> Signed-off-by: Qii Wang 

I tried to understand from the code what the new binding expresses, but
I don't fully understand it. Is it the maximum clock stretch time?
Because I cannot recall a device which always uses the same delay for
clock stretching.





[PATCH] platform/surface: aggregator_registry: Give devices time to set up when connecting

2021-04-05 Thread Maximilian Luz
Sometimes, the "base connected" event that we rely on to (re-)attach the
device connected to the base is sent a bit too early. When this happens,
some devices may not be completely ready yet.

Specifically, the battery has been observed to report zero-values for
things like full charge capacity, which, however, is only loaded once
when the driver for that device probes. This can thus result in battery
readings being unavailable.

As we cannot easily and reliably discern between devices that are not
ready yet and devices that are not connected (i.e. will never be ready),
delay adding these devices. This should give them enough time to set up.

The delay is set to 2.5 seconds, which should give us a good safety
margin based on testing and still be fairly responsive for users.

To achieve that delay, switch to updating via a delayed work struct,
which means that we can also get rid of some locking.

Signed-off-by: Maximilian Luz 
---
 .../surface/surface_aggregator_registry.c | 98 ---
 1 file changed, 40 insertions(+), 58 deletions(-)

diff --git a/drivers/platform/surface/surface_aggregator_registry.c 
b/drivers/platform/surface/surface_aggregator_registry.c
index eccb9d1007cd..685d37a7add1 100644
--- a/drivers/platform/surface/surface_aggregator_registry.c
+++ b/drivers/platform/surface/surface_aggregator_registry.c
@@ -13,10 +13,10 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -287,6 +287,13 @@ static int ssam_hub_add_devices(struct device *parent, 
struct ssam_controller *c
 
 /* -- SSAM base-hub driver. - 
*/
 
+/*
+ * Some devices (especially battery) may need a bit of time to be fully usable
+ * after being (re-)connected. This delay has been determined via
+ * experimentation.
+ */
+#define SSAM_BASE_UPDATE_CONNECT_DELAY msecs_to_jiffies(2500)
+
 enum ssam_base_hub_state {
SSAM_BASE_HUB_UNINITIALIZED,
SSAM_BASE_HUB_CONNECTED,
@@ -296,8 +303,8 @@ enum ssam_base_hub_state {
 struct ssam_base_hub {
struct ssam_device *sdev;
 
-   struct mutex lock;  /* Guards state update checks and transitions. */
enum ssam_base_hub_state state;
+   struct delayed_work update_work;
 
struct ssam_event_notifier notif;
 };
@@ -335,11 +342,7 @@ static ssize_t ssam_base_hub_state_show(struct device 
*dev, struct device_attrib
char *buf)
 {
struct ssam_base_hub *hub = dev_get_drvdata(dev);
-   bool connected;
-
-   mutex_lock(&hub->lock);
-   connected = hub->state == SSAM_BASE_HUB_CONNECTED;
-   mutex_unlock(&hub->lock);
+   bool connected = hub->state == SSAM_BASE_HUB_CONNECTED;
 
return sysfs_emit(buf, "%d\n", connected);
 }
@@ -356,16 +359,20 @@ static const struct attribute_group ssam_base_hub_group = 
{
.attrs = ssam_base_hub_attrs,
 };
 
-static int __ssam_base_hub_update(struct ssam_base_hub *hub, enum 
ssam_base_hub_state new)
+static void ssam_base_hub_update_workfn(struct work_struct *work)
 {
+   struct ssam_base_hub *hub = container_of(work, struct ssam_base_hub, update_work.work);
    struct fwnode_handle *node = dev_fwnode(&hub->sdev->dev);
+   enum ssam_base_hub_state state;
int status = 0;
 
-   lockdep_assert_held(&hub->lock);
+   status = ssam_base_hub_query_state(hub, &state);
+   if (status)
+   return;
 
-   if (hub->state == new)
-   return 0;
-   hub->state = new;
+   if (hub->state == state)
+   return;
+   hub->state = state;
 
if (hub->state == SSAM_BASE_HUB_CONNECTED)
        status = ssam_hub_add_devices(&hub->sdev->dev, hub->sdev->ctrl, node);
@@ -374,51 +381,28 @@ static int __ssam_base_hub_update(struct ssam_base_hub 
*hub, enum ssam_base_hub_
 
if (status)
        dev_err(&hub->sdev->dev, "failed to update base-hub devices: %d\n", status);
-
-   return status;
-}
-
-static int ssam_base_hub_update(struct ssam_base_hub *hub)
-{
-   enum ssam_base_hub_state state;
-   int status;
-
-   mutex_lock(&hub->lock);
-
-   status = ssam_base_hub_query_state(hub, &state);
-   if (!status)
-       status = __ssam_base_hub_update(hub, state);
-
-   mutex_unlock(&hub->lock);
-   return status;
 }
 
 static u32 ssam_base_hub_notif(struct ssam_event_notifier *nf, const struct 
ssam_event *event)
 {
-   struct ssam_base_hub *hub;
-   struct ssam_device *sdev;
-   enum ssam_base_hub_state new;
-
-   hub = container_of(nf, struct ssam_base_hub, notif);
-   sdev = hub->sdev;
+   struct ssam_base_hub *hub = container_of(nf, struct ssam_base_hub, 
notif);
+   unsigned long delay;
 
if (event->command_id != SSAM_EVENT_BAS_CID_CONNECTION)
return 0;
 
if (event->length < 1) 
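
The work function in this patch does two things: it recovers the hub from
the embedded work item with container_of(), and it acts only when the
freshly queried state differs from the cached one. The following user-space
sketch models just that shape. All types and names in it are stand-ins
invented for illustration (there is no real workqueue, delay, or device
handling here), so treat it as a reading aid and not as driver code.

```c
#include <stddef.h>

/* User-space stand-ins for the kernel types; all names are hypothetical. */
struct work { int pending; };
struct delayed_work { struct work work; };

enum hub_state { HUB_UNINITIALIZED, HUB_CONNECTED, HUB_DISCONNECTED };

struct hub {
	enum hub_state state;
	struct delayed_work update_work;
	int devices_added;	/* counts simulated "add devices" calls */
	int devices_removed;	/* counts simulated "remove devices" calls */
};

/* Same idea as the kernel's container_of(), without its type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Recover the hub from the embedded work item, then act only if the
 * queried state differs from the cached one.
 * Returns 1 if a transition was handled, 0 if nothing changed.
 */
int hub_update_workfn(struct work *work, enum hub_state queried)
{
	struct hub *hub = container_of(work, struct hub, update_work.work);

	if (hub->state == queried)
		return 0;
	hub->state = queried;

	if (hub->state == HUB_CONNECTED)
		hub->devices_added++;
	else
		hub->devices_removed++;

	return 1;
}
```

Because the state is queried when the work runs, and not when the event
arrives, repeated notifications for the same state collapse into a single
transition.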

Re: [PATCH RESEND v1 4/4] powerpc/vdso: Add support for time namespaces

2021-04-04 Thread Andrei Vagin
On Wed, Mar 31, 2021 at 04:48:47PM +, Christophe Leroy wrote:
> This patch adds the necessary glue to provide time namespaces.
> 
> Things are mainly copied from ARM64.
> 
> __arch_get_timens_vdso_data() calculates timens vdso data position
> based on the vdso data position, knowing it is the next page in vvar.
> This avoids having to redo the mflr/bcl/mflr/mtlr dance to locate
> the page relative to running code position.
>

Acked-by: Andrei Vagin 
 
> Signed-off-by: Christophe Leroy 


[RFC PATCH v6 1/1] cmdline: Add capability to both append and prepend at the same time

2021-04-04 Thread Christophe Leroy
One user has expressed the need to both append and prepend some
built-in parameters to the command line provided by the bootloader.

Although it is a corner case, it is easy to implement, so let's do it.

When the user chooses to prepend the bootloader provided command line
with the built-in command line, he is offered the possibility to enter
an additional built-in command line to be appended after the
bootloader provided command line.

It is a complementary feature which has no impact on the already
existing ones and/or the existing defconfig.

Suggested-by: Daniel Walker 
Signed-off-by: Christophe Leroy 
---
Sending this out as an RFC, applies on top of the series
("Implement GENERIC_CMDLINE"). I will add it to the series next spin
unless someone is against it.
---
 include/linux/cmdline.h |  3 +++
 init/Kconfig| 12 +++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/linux/cmdline.h b/include/linux/cmdline.h
index 020028e2bdf0..fb274a4d5519 100644
--- a/include/linux/cmdline.h
+++ b/include/linux/cmdline.h
@@ -36,6 +36,9 @@ static __always_inline bool __cmdline_build(char *dst, const 
char *src)
 
len = cmdline_strlcat(dst, src, COMMAND_LINE_SIZE);
 
+   if (IS_ENABLED(CONFIG_CMDLINE_PREPEND))
+   len = cmdline_strlcat(dst, " " CONFIG_CMDLINE_MORE, 
COMMAND_LINE_SIZE);
+
if (IS_ENABLED(CONFIG_CMDLINE_APPEND))
len = cmdline_strlcat(dst, " " CONFIG_CMDLINE, 
COMMAND_LINE_SIZE);
 
diff --git a/init/Kconfig b/init/Kconfig
index fa002e3765ab..cd3087ff4f28 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -128,6 +128,14 @@ config CMDLINE
  If this string is not empty, additional choices are proposed
  below to determine how it will be used by the kernel.
 
+config CMDLINE_MORE
+   string "Additional default kernel command string" if GENERIC_CMDLINE && 
CMDLINE_PREPEND
+   default ""
+   help
+ Defines an additional default kernel command string.
+ If this string is not empty, it is appended to the
+ command-line arguments provided by the bootloader
+
 choice
prompt "Kernel command line type" if CMDLINE != ""
default CMDLINE_PREPEND if ARCH_WANT_CMDLINE_PREPEND_BY_DEFAULT
@@ -154,7 +162,9 @@ config CMDLINE_PREPEND
bool "Prepend to the bootloader kernel arguments"
help
  The default kernel command string will be prepended to the
- command-line arguments provided by the bootloader.
+ command-line arguments provided by the bootloader. When this
+ option is selected, another string can be added which will
+ be appended.
 
 config CMDLINE_FORCE
bool "Always use the default kernel command string"
-- 
2.25.0
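
To make the resulting ordering concrete, here is a small user-space model
of the "prepend" case with the extra string. It only illustrates the order
in which the three pieces end up (built-in, bootloader, additional); the
function name and signature are invented for this sketch and it does not
use the kernel's cmdline_strlcat() helper.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical model: build "<builtin> <bootloader> <more>" into dst,
 * truncating to size - 1 characters like strlcat() would.
 * Returns the length the full string would have had, so the caller can
 * detect truncation by comparing the result against size.
 */
size_t build_cmdline(char *dst, size_t size, const char *builtin,
		     const char *bootloader, const char *more)
{
	int n = snprintf(dst, size, "%s %s %s", builtin, bootloader, more);

	return n < 0 ? 0 : (size_t)n;
}
```

With builtin "console=ttyS0", bootloader "root=/dev/sda1" and more "quiet",
the built-in string comes first and the additional string comes last, which
is the combination the commit message describes.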



Re: [PATCH net-next v1 6/9] net: dsa: qca: ar9331: add ageing time support

2021-04-03 Thread Florian Fainelli




On 4/3/2021 04:48, Oleksij Rempel wrote:
> This switch provides global ageing time configuration, so let DSA use
> it.
> 
> Signed-off-by: Oleksij Rempel 


Reviewed-by: Florian Fainelli 
--
Florian


Re: [PATCH] perf record: Disallow -c and -F option at the same time

2021-04-03 Thread Arnaldo Carvalho de Melo
Em Fri, Apr 02, 2021 at 08:25:30PM -0700, Alexey Alexandrov escreveu:
> A warning can be missed when the tool is run by some kind of automation.
> Backward compatibility aside, I think conflicting flags should result in an
> early exit to avoid later surprises.

Sure, I agree with you in principle. But we erred in the past by accepting
this combination, and now, out of the blue, treating it as what it always
should have been, an error, feels like an error in itself.

I sent this message after merging the change, but before pushing it out
publicly I felt some (more) discussion would be in order.

Are you sure that potentially breaking existing scripts is ok in this
case?

Up to you, frankly.

- Arnaldo
 
> On Fri, Apr 2, 2021 at 6:37 AM Arnaldo Carvalho de Melo 
> wrote:
> 
> > Em Fri, Apr 02, 2021 at 06:40:20PM +0900, Namhyung Kim escreveu:
> > > It's confusing which one is effective when the both options are given.
> > > The current code happens to use -c in this case but users might not be
> > > aware of it.  We can change it to complain about that instead of
> > > relying on the implicit priority.
> > >
> > > Before:
> > >   $ perf record -c 11 -F 99 true
> > >   [ perf record: Woken up 1 times to write data ]
> > >   [ perf record: Captured and wrote 0.031 MB perf.data (8 samples) ]
> > >
> > >   $ perf evlist -F
> > >   cycles: sample_period=11
> > >
> > > After:
> > >   $ perf record -c 11 -F 99 true
> > >   cannot set frequency and period at the same time
> > >
> > > So this change can break existing usages, but I think it's rare to
> > > have both options and it'd be better changing them.
> >
> > Humm, perhaps we can just make that a warning stating that -c is used
> > if both are specified?
> >
> > $ perf record -c 11 -F 99 true
> > Frequency and period can't be used at the same time, -c 1 will be used.
> >
> > - Arnaldo
> >
> > > Suggested-by: Alexey Alexandrov 
> > > Signed-off-by: Namhyung Kim 
> > > ---
> > >  tools/perf/util/record.c | 8 +++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
> > > index f99852d54b14..43e5b563dee8 100644
> > > --- a/tools/perf/util/record.c
> > > +++ b/tools/perf/util/record.c
> > > @@ -157,9 +157,15 @@ static int get_max_rate(unsigned int *rate)
> > >  static int record_opts__config_freq(struct record_opts *opts)
> > >  {
> > >   bool user_freq = opts->user_freq != UINT_MAX;
> > > + bool user_interval = opts->user_interval != ULLONG_MAX;
> > >   unsigned int max_rate;
> > >
> > > - if (opts->user_interval != ULLONG_MAX)
> > > + if (user_interval && user_freq) {
> > > + pr_err("cannot set frequency and period at the same time\n");
> > > + return -1;
> > > + }
> > > +
> > > + if (user_interval)
> > >   opts->default_interval = opts->user_interval;
> > >   if (user_freq)
> > >   opts->freq = opts->user_freq;
> > > --
> > > 2.31.0.208.g409f899ff0-goog
> > >
> >
> > --
> >
> > - Arnaldo
> >

-- 

- Arnaldo


Re: [PATCH net-next v1 6/9] net: dsa: qca: ar9331: add ageing time support

2021-04-03 Thread Andrew Lunn
On Sat, Apr 03, 2021 at 01:48:45PM +0200, Oleksij Rempel wrote:
> This switch provides global ageing time configuration, so let DSA use
> it.
> 
> Signed-off-by: Oleksij Rempel 

Reviewed-by: Andrew Lunn 

Andrew


[PATCH net-next v1 6/9] net: dsa: qca: ar9331: add ageing time support

2021-04-03 Thread Oleksij Rempel
This switch provides global ageing time configuration, so let DSA use
it.

Signed-off-by: Oleksij Rempel 
---
 drivers/net/dsa/qca/ar9331.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/net/dsa/qca/ar9331.c b/drivers/net/dsa/qca/ar9331.c
index 4a98f14f31f4..b2c22ba924f0 100644
--- a/drivers/net/dsa/qca/ar9331.c
+++ b/drivers/net/dsa/qca/ar9331.c
@@ -1115,6 +1115,25 @@ static void ar9331_sw_port_fast_age(struct dsa_switch 
*ds, int port)
dev_err_ratelimited(priv->dev, "%s: error: %i\n", __func__, ret);
 }
 
+static int ar9331_sw_set_ageing_time(struct dsa_switch *ds,
+unsigned int ageing_time)
+{
+   struct ar9331_sw_priv *priv = (struct ar9331_sw_priv *)ds->priv;
+   struct regmap *regmap = priv->regmap;
+   u32 time, val;
+
+   time = DIV_ROUND_UP(ageing_time, AR9331_SW_AT_AGE_TIME_COEF);
+   if (!time)
+   time = 1;
+   else if (time > U16_MAX)
+   time = U16_MAX;
+
+   val = FIELD_PREP(AR9331_SW_AT_AGE_TIME, time) | AR9331_SW_AT_AGE_EN;
+   return regmap_update_bits(regmap, AR9331_SW_REG_ADDR_TABLE_CTRL,
+ AR9331_SW_AT_AGE_EN | AR9331_SW_AT_AGE_TIME,
+ val);
+}
+
 static const struct dsa_switch_ops ar9331_sw_ops = {
.get_tag_protocol   = ar9331_sw_get_tag_protocol,
.setup  = ar9331_sw_setup,
@@ -1130,6 +1149,7 @@ static const struct dsa_switch_ops ar9331_sw_ops = {
.port_fdb_dump  = ar9331_sw_port_fdb_dump,
.port_mdb_add   = ar9331_sw_port_mdb_add,
.port_mdb_del   = ar9331_sw_port_mdb_del,
+   .set_ageing_time= ar9331_sw_set_ageing_time,
 };
 
 static irqreturn_t ar9331_sw_irq(int irq, void *data)
@@ -1476,6 +1496,8 @@ static int ar9331_sw_probe(struct mdio_device *mdiodev)
priv->ops = ar9331_sw_ops;
    ds->ops = &priv->ops;
    dev_set_drvdata(&mdiodev->dev, priv);
+   ds->ageing_time_min = AR9331_SW_AT_AGE_TIME_COEF;
+   ds->ageing_time_max = AR9331_SW_AT_AGE_TIME_COEF * U16_MAX;
 
for (i = 0; i < ARRAY_SIZE(priv->port); i++) {
        struct ar9331_sw_port *port = &priv->port[i];
-- 
2.29.2
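
The conversion in ar9331_sw_set_ageing_time() is a round-up division
followed by a clamp to the 16-bit field. The sketch below reproduces that
arithmetic in user space. The coefficient is passed in as a parameter
because the real value of AR9331_SW_AT_AGE_TIME_COEF is not shown in this
patch; the function name and the assumption that the input is in
milliseconds are mine.

```c
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Convert an ageing time (assumed to be in milliseconds) to the value of a
 * 16-bit register field, following the rounding and clamping in the patch.
 * coef is the number of milliseconds per register unit and must be
 * non-zero; its real value comes from the driver and is not assumed here.
 */
uint16_t ageing_time_to_field(uint32_t ageing_time_ms, uint32_t coef)
{
	uint64_t time = DIV_ROUND_UP((uint64_t)ageing_time_ms, coef);

	if (!time)
		time = 1;		/* never program a zero ageing time */
	else if (time > UINT16_MAX)
		time = UINT16_MAX;	/* saturate at the field width */

	return (uint16_t)time;
}
```

The same bounds explain the two limits set in probe: the minimum is one
unit of the coefficient and the maximum is U16_MAX units.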



[PATCH v2 5/7] perf arm-spe: Assign kernel time to synthesized event

2021-04-03 Thread Leo Yan
The current code assigns the arch timer counter to the samples synthesized
from the Arm SPE trace, so the samples contain only the raw counter value
and not the kernel time.

To fix the issue, this patch converts the timer counter to kernel time
and assigns it to sample timestamp.

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 4cf558b0218a..80f5659e7f7e 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -234,7 +234,7 @@ static void arm_spe_prep_sample(struct arm_spe *spe,
    struct arm_spe_record *record = &speq->decoder->record;
 
    if (!spe->timeless_decoding)
-       sample->time = speq->timestamp;
+       sample->time = tsc_to_perf_time(record->timestamp, &spe->tc);
 
sample->ip = record->from_ip;
sample->cpumode = arm_spe_cpumode(spe, sample->ip);
-- 
2.25.1



[PATCH v2 4/7] perf arm-spe: Convert event kernel time to counter value

2021-04-03 Thread Leo Yan
When handling a perf event, the Arm SPE decoder needs to decide whether the
event is earlier or later than the samples from the Arm SPE trace data; to
make the comparison, both times must use the same unit.

This patch converts the event kernel time to arch timer's counter value,
thus it can be used to compare with counter value contained in Arm SPE
Timestamp packet.

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 69ce3483d1af..4cf558b0218a 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -669,7 +669,7 @@ static int arm_spe_process_event(struct perf_session 
*session,
}
 
if (sample->time && (sample->time != (u64) -1))
-   timestamp = sample->time;
+       timestamp = perf_time_to_tsc(sample->time, &spe->tc);
else
timestamp = 0;
 
-- 
2.25.1
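
Patches 4/7 and 5/7 rely on converting between the arch timer counter and
kernel time in both directions. As a rough illustration, the sketch below
follows the time_zero / time_mult / time_shift scaling scheme documented
for perf_event_mmap_page. The struct and function names are invented here,
and this is a simplified model, not perf's tsc_to_perf_time() or
perf_time_to_tsc() themselves.

```c
#include <stdint.h>

/* Hypothetical, reduced set of conversion parameters. */
struct tsc_conv {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

/*
 * counter -> ns: time_zero + (cyc * time_mult) >> time_shift, with the
 * multiplication split into quotient and remainder to limit overflow.
 */
uint64_t counter_to_ns(uint64_t cyc, const struct tsc_conv *tc)
{
	uint64_t quot = cyc >> tc->time_shift;
	uint64_t rem = cyc & (((uint64_t)1 << tc->time_shift) - 1);

	return tc->time_zero + quot * tc->time_mult +
	       ((rem * tc->time_mult) >> tc->time_shift);
}

/* ns -> counter: the inverse scaling, again split to limit overflow. */
uint64_t ns_to_counter(uint64_t ns, const struct tsc_conv *tc)
{
	uint64_t t = ns - tc->time_zero;
	uint64_t quot = t / tc->time_mult;
	uint64_t rem = t % tc->time_mult;

	return (quot << tc->time_shift) +
	       (rem << tc->time_shift) / tc->time_mult;
}
```

Once both timestamps are in the same unit, ordering a perf event against an
SPE record is a plain integer comparison.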



Re: [PATCH] perf record: Disallow -c and -F option at the same time

2021-04-02 Thread Arnaldo Carvalho de Melo
Em Fri, Apr 02, 2021 at 06:40:20PM +0900, Namhyung Kim escreveu:
> It's confusing which one is effective when the both options are given.
> The current code happens to use -c in this case but users might not be
> aware of it.  We can change it to complain about that instead of
> relying on the implicit priority.
> 
> Before:
>   $ perf record -c 11 -F 99 true
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.031 MB perf.data (8 samples) ]
> 
>   $ perf evlist -F
>   cycles: sample_period=11
> 
> After:
>   $ perf record -c 11 -F 99 true
>   cannot set frequency and period at the same time
> 
> So this change can break existing usages, but I think it's rare to
> have both options and it'd be better changing them.

Humm, perhaps we can just make that a warning stating that -c is used
if both are specified?

$ perf record -c 11 -F 99 true
Frequency and period can't be used at the same time, -c 1 will be used.

- Arnaldo
 
> Suggested-by: Alexey Alexandrov 
> Signed-off-by: Namhyung Kim 
> ---
>  tools/perf/util/record.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
> index f99852d54b14..43e5b563dee8 100644
> --- a/tools/perf/util/record.c
> +++ b/tools/perf/util/record.c
> @@ -157,9 +157,15 @@ static int get_max_rate(unsigned int *rate)
>  static int record_opts__config_freq(struct record_opts *opts)
>  {
>   bool user_freq = opts->user_freq != UINT_MAX;
> + bool user_interval = opts->user_interval != ULLONG_MAX;
>   unsigned int max_rate;
>  
> - if (opts->user_interval != ULLONG_MAX)
> + if (user_interval && user_freq) {
> + pr_err("cannot set frequency and period at the same time\n");
> + return -1;
> + }
> +
> + if (user_interval)
>   opts->default_interval = opts->user_interval;
>   if (user_freq)
>   opts->freq = opts->user_freq;
> -- 
> 2.31.0.208.g409f899ff0-goog
> 

-- 

- Arnaldo


Re: [PATCH v3 3/3] ima: enable loading of build time generated key on .ima keyring

2021-04-02 Thread Stefan Berger



On 3/30/21 9:16 AM, Nayna Jain wrote:

The kernel currently only loads the kernel module signing key onto the
builtin trusted keyring. Load the module signing key onto the IMA keyring
as well.

Signed-off-by: Nayna Jain 

Acked-by: Stefan Berger 

---
  certs/system_certificates.S   | 13 +-
  certs/system_keyring.c| 47 +++
  include/keys/system_keyring.h |  7 ++
  security/integrity/digsig.c   |  2 ++
  4 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/certs/system_certificates.S b/certs/system_certificates.S
index 8f29058adf93..dcad27ea8527 100644
--- a/certs/system_certificates.S
+++ b/certs/system_certificates.S
@@ -8,9 +8,11 @@
.globl system_certificate_list
  system_certificate_list:
  __cert_list_start:
-#ifdef CONFIG_MODULE_SIG
+__module_cert_start:
+#if defined(CONFIG_MODULE_SIG) || defined(CONFIG_IMA_APPRAISE_MODSIG)
.incbin "certs/signing_key.x509"
  #endif
+__module_cert_end:
.incbin "certs/x509_certificate_list"
  __cert_list_end:
  
@@ -35,3 +37,12 @@ system_certificate_list_size:

  #else
.long __cert_list_end - __cert_list_start
  #endif
+
+   .align 8
+   .globl module_cert_size
+module_cert_size:
+#ifdef CONFIG_64BIT
+   .quad __module_cert_end - __module_cert_start
+#else
+   .long __module_cert_end - __module_cert_start
+#endif
diff --git a/certs/system_keyring.c b/certs/system_keyring.c
index 4b693da488f1..bb122bf4cc17 100644
--- a/certs/system_keyring.c
+++ b/certs/system_keyring.c
@@ -27,6 +27,7 @@ static struct key *platform_trusted_keys;
  
  extern __initconst const u8 system_certificate_list[];

  extern __initconst const unsigned long system_certificate_list_size;
+extern __initconst const unsigned long module_cert_size;
  
  /**

   * restrict_link_to_builtin_trusted - Restrict keyring addition by built in CA
@@ -132,19 +133,11 @@ static __init int system_trusted_keyring_init(void)
   */
  device_initcall(system_trusted_keyring_init);
  
-/*

- * Load the compiled-in list of X.509 certificates.
- */
-static __init int load_system_certificate_list(void)
+static __init int load_cert(const u8 *p, const u8 *end, struct key *keyring)
  {
key_ref_t key;
-   const u8 *p, *end;
size_t plen;
  
-	pr_notice("Loading compiled-in X.509 certificates\n");

-
-   p = system_certificate_list;
-   end = p + system_certificate_list_size;
while (p < end) {
/* Each cert begins with an ASN.1 SEQUENCE tag and must be more
 * than 256 bytes in size.
@@ -159,7 +152,7 @@ static __init int load_system_certificate_list(void)
if (plen > end - p)
goto dodgy_cert;
  
-		key = key_create_or_update(make_key_ref(builtin_trusted_keys, 1),

+   key = key_create_or_update(make_key_ref(keyring, 1),
   "asymmetric",
   NULL,
   p,
@@ -186,6 +179,40 @@ static __init int load_system_certificate_list(void)
pr_err("Problem parsing in-kernel X.509 certificate list\n");
return 0;
  }
+
+__init int load_module_cert(struct key *keyring)
+{
+   const u8 *p, *end;
+
+   if (!IS_ENABLED(CONFIG_IMA_APPRAISE_MODSIG))
+   return 0;
+
+   pr_notice("Loading compiled-in module X.509 certificates\n");
+
+   p = system_certificate_list;
+   end = p + module_cert_size;
+
+   return load_cert(p, end, keyring);
+}
+
+/*
+ * Load the compiled-in list of X.509 certificates.
+ */
+static __init int load_system_certificate_list(void)
+{
+   const u8 *p, *end;
+
+   pr_notice("Loading compiled-in X.509 certificates\n");
+
+#ifdef CONFIG_MODULE_SIG
+   p = system_certificate_list;
+#else
+   p = system_certificate_list + module_cert_size;
+#endif
+
+   end = p + system_certificate_list_size;
+   return load_cert(p, end, builtin_trusted_keys);
+}
  late_initcall(load_system_certificate_list);
  
  #ifdef CONFIG_SYSTEM_DATA_VERIFICATION

diff --git a/include/keys/system_keyring.h b/include/keys/system_keyring.h
index fb8b07daa9d1..f954276c616a 100644
--- a/include/keys/system_keyring.h
+++ b/include/keys/system_keyring.h
@@ -16,9 +16,16 @@ extern int restrict_link_by_builtin_trusted(struct key 
*keyring,
const struct key_type *type,
const union key_payload *payload,
struct key *restriction_key);
+extern __init int load_module_cert(struct key *keyring);
  
  #else

  #define restrict_link_by_builtin_trusted restrict_link_reject
+
+static inline __init int load_module_cert(struct key *keyring)
+{
+   return 0;
+}
+
  #endif
  
  #ifdef CONFIG_SECONDARY_TRUSTED_KEYRING

diff --git a/security/integrity/digsig.c b/security/integrity/digsig.c
index 250fb0836156..3b06a01bd0fd 100644
--- 

Re: [PATCH v3 2/3] ima: enable signing of modules with build time generated key

2021-04-02 Thread Stefan Berger



On 3/30/21 9:16 AM, Nayna Jain wrote:

The kernel build process currently only signs kernel modules when
MODULE_SIG is enabled. Also, sign the kernel modules at build time when
IMA_APPRAISE_MODSIG is enabled.

Signed-off-by: Nayna Jain 

Acked-by: Stefan Berger 

---
  certs/Kconfig  | 2 +-
  certs/Makefile | 8 
  init/Kconfig   | 6 +++---
  3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/certs/Kconfig b/certs/Kconfig
index c94e93d8bccf..48675ad319db 100644
--- a/certs/Kconfig
+++ b/certs/Kconfig
@@ -4,7 +4,7 @@ menu "Certificates for signature checking"
  config MODULE_SIG_KEY
string "File name or PKCS#11 URI of module signing key"
default "certs/signing_key.pem"
-   depends on MODULE_SIG
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
help
   Provide the file name of a private key/certificate in PEM format,
   or a PKCS#11 URI according to RFC7512. The file should contain, or
diff --git a/certs/Makefile b/certs/Makefile
index f4c25b67aad9..e3185c57fbd8 100644
--- a/certs/Makefile
+++ b/certs/Makefile
@@ -32,6 +32,14 @@ endif # CONFIG_SYSTEM_TRUSTED_KEYRING
  clean-files := x509_certificate_list .x509.list
  
  ifeq ($(CONFIG_MODULE_SIG),y)

+   SIGN_KEY = y
+endif
+
+ifeq ($(CONFIG_IMA_APPRAISE_MODSIG),y)
+   SIGN_KEY = y
+endif
+
+ifdef SIGN_KEY
  
###
  #
  # If module signing is requested, say by allyesconfig, but a key has not been
diff --git a/init/Kconfig b/init/Kconfig
index 5f5c776ef192..85e48a578f90 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2164,7 +2164,7 @@ config MODULE_SIG_FORCE
  config MODULE_SIG_ALL
bool "Automatically sign all modules"
default y
-   depends on MODULE_SIG
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
help
  Sign all modules during make modules_install. Without this option,
  modules must be signed manually, using the scripts/sign-file tool.
@@ -2174,7 +2174,7 @@ comment "Do not forget to sign required modules with 
scripts/sign-file"
  
  choice

prompt "Which hash algorithm should modules be signed with?"
-   depends on MODULE_SIG
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
help
  This determines which sort of hashing algorithm will be used during
  signature generation.  This algorithm _must_ be built into the kernel
@@ -2206,7 +2206,7 @@ endchoice
  
  config MODULE_SIG_HASH

string
-   depends on MODULE_SIG
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
default "sha1" if MODULE_SIG_SHA1
default "sha224" if MODULE_SIG_SHA224
default "sha256" if MODULE_SIG_SHA256


[PATCH] perf record: Disallow -c and -F option at the same time

2021-04-02 Thread Namhyung Kim
It's confusing which one is effective when the both options are given.
The current code happens to use -c in this case but users might not be
aware of it.  We can change it to complain about that instead of
relying on the implicit priority.

Before:
  $ perf record -c 11 -F 99 true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.031 MB perf.data (8 samples) ]

  $ perf evlist -F
  cycles: sample_period=11

After:
  $ perf record -c 11 -F 99 true
  cannot set frequency and period at the same time

This change can break existing usages, but I think it's rare to
pass both options, and it would be better to change such usages.

Suggested-by: Alexey Alexandrov 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/record.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index f99852d54b14..43e5b563dee8 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -157,9 +157,15 @@ static int get_max_rate(unsigned int *rate)
 static int record_opts__config_freq(struct record_opts *opts)
 {
bool user_freq = opts->user_freq != UINT_MAX;
+   bool user_interval = opts->user_interval != ULLONG_MAX;
unsigned int max_rate;
 
-   if (opts->user_interval != ULLONG_MAX)
+   if (user_interval && user_freq) {
+   pr_err("cannot set frequency and period at the same time\n");
+   return -1;
+   }
+
+   if (user_interval)
opts->default_interval = opts->user_interval;
if (user_freq)
opts->freq = opts->user_freq;
-- 
2.31.0.208.g409f899ff0-goog
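
The check added by this patch can be tried outside of perf with a small user-space sketch. It assumes the same sentinel convention as the diff (UINT_MAX and ULLONG_MAX mean "option not given"); `struct sample_opts` and `check_freq_period()` are hypothetical names for illustration, not part of perf.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the two struct record_opts fields used here. */
struct sample_opts {
	unsigned int user_freq;           /* UINT_MAX: -F was not given */
	unsigned long long user_interval; /* ULLONG_MAX: -c was not given */
};

/* Returns 0 if at most one of -c/-F was given, -1 if both were given. */
static int check_freq_period(const struct sample_opts *opts)
{
	bool user_freq, user_interval;

	assert(opts != NULL);
	user_freq = opts->user_freq != UINT_MAX;
	user_interval = opts->user_interval != ULLONG_MAX;

	if (user_interval && user_freq) {
		fprintf(stderr, "cannot set frequency and period at the same time\n");
		return -1;
	}
	return 0;
}
```

Only the combination of both options is rejected; giving just one of them, or neither, still returns 0.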



[PATCH 3/3] powerpc/32s: Define a MODULE area below kernel text all the time

2021-04-01 Thread Christophe Leroy
On book3s/32, the segment below kernel text is used for module
allocation when CONFIG_STRICT_KERNEL_RWX is defined.

In order to benefit from the powerpc specific module_alloc()
function, which allocates modules within 32 Mbytes of the
end of kernel text, use that segment below PAGE_OFFSET at all times.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Kconfig | 2 +-
 arch/powerpc/include/asm/book3s/32/pgtable.h | 2 --
 arch/powerpc/mm/book3s32/mmu.c   | 7 ---
 3 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index c1344c05226c..15a91202d5c3 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -1219,7 +1219,7 @@ config TASK_SIZE_BOOL
 config TASK_SIZE
hex "Size of user task space" if TASK_SIZE_BOOL
default "0x80000000" if PPC_8xx
-   default "0xb0000000" if PPC_BOOK3S_32 && STRICT_KERNEL_RWX
+   default "0xb0000000" if PPC_BOOK3S_32
default "0xc0000000"
 endmenu
 
diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 415ae29fa73a..83c65845a1a9 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -194,10 +194,8 @@ int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot);
 #define VMALLOC_END    ioremap_bot
 #endif
 
-#ifdef CONFIG_STRICT_KERNEL_RWX
 #define MODULES_END    ALIGN_DOWN(PAGE_OFFSET, SZ_256M)
 #define MODULES_VADDR  (MODULES_END - SZ_256M)
-#endif
 
 #ifndef __ASSEMBLY__
 #include 
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index a0db398b5c26..159930351d9f 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -184,17 +184,10 @@ static bool is_module_segment(unsigned long addr)
 {
if (!IS_ENABLED(CONFIG_MODULES))
return false;
-#ifdef MODULES_VADDR
if (addr < ALIGN_DOWN(MODULES_VADDR, SZ_256M))
return false;
if (addr > ALIGN(MODULES_END, SZ_256M) - 1)
return false;
-#else
-   if (addr < ALIGN_DOWN(VMALLOC_START, SZ_256M))
-   return false;
-   if (addr > ALIGN(VMALLOC_END, SZ_256M) - 1)
-   return false;
-#endif
return true;
 }
 
-- 
2.25.0
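
The address arithmetic behind the now-unconditional module area can be checked with a user-space sketch. It assumes PAGE_OFFSET is 0xc0000000 (an assumption for illustration; the real value is configurable) and re-creates ALIGN()/ALIGN_DOWN() locally. The CONFIG_MODULES test from the kernel function is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define SZ_256M		0x10000000UL
/* Assumed for this sketch; configurable in the kernel. */
#define PAGE_OFFSET	0xc0000000UL

/* Local re-creations of the kernel helpers (power-of-two alignment only). */
#define ALIGN_DOWN(x, a)	((x) & ~((a) - 1))
#define ALIGN(x, a)		(((x) + (a) - 1) & ~((a) - 1))

#define MODULES_END	ALIGN_DOWN(PAGE_OFFSET, SZ_256M)
#define MODULES_VADDR	(MODULES_END - SZ_256M)

/* Under the assumption above, the module area is the segment below PAGE_OFFSET. */
static_assert(MODULES_VADDR == 0xb0000000UL, "module area starts 256M below PAGE_OFFSET");

/* Same range test as the patched is_module_segment(), minus the config check. */
static bool is_module_segment(unsigned long addr)
{
	if (addr < ALIGN_DOWN(MODULES_VADDR, SZ_256M))
		return false;
	if (addr > ALIGN(MODULES_END, SZ_256M) - 1)
		return false;
	return true;
}
```

With these values the module segment spans 0xb0000000 through 0xbfffffff, so user space has to end at or below 0xb0000000 for that segment to be free.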



[PATCH RESEND v1 4/4] powerpc/vdso: Add support for time namespaces

2021-03-31 Thread Christophe Leroy
This patch adds the necessary glue to provide time namespaces.

Things are mainly copied from ARM64.

__arch_get_timens_vdso_data() calculates timens vdso data position
based on the vdso data position, knowing it is the next page in vvar.
This avoids having to redo the mflr/bcl/mflr/mtlr dance to locate
the page relative to running code position.
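
The pointer arithmetic described above can be sketched in user space as follows. PAGE_SIZE is assumed to be 4096 for illustration and `struct vdso_data` is a hypothetical one-field stand-in; only the enum and the "next page" computation mirror the patch.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL /* assumed page size for this sketch */

/* Page order inside the vvar mapping, as in the patch. */
enum vvar_pages {
	VVAR_DATA_PAGE_OFFSET,
	VVAR_TIMENS_PAGE_OFFSET,
	VVAR_NR_PAGES,
};

/* The timens page directly follows the data page. */
static_assert(VVAR_TIMENS_PAGE_OFFSET == VVAR_DATA_PAGE_OFFSET + 1,
	      "timens page must follow the data page");

struct vdso_data { unsigned long seq; }; /* hypothetical stand-in */

/* Derive the timens data pointer from the already known vdso data pointer. */
static const struct vdso_data *timens_vdso_data(const struct vdso_data *vd)
{
	return (const struct vdso_data *)((const char *)vd + PAGE_SIZE);
}
```

Because the second pointer is derived from the first, no additional position-relative address computation is needed once the data page has been located.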

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Kconfig |   3 +-
 arch/powerpc/include/asm/vdso/gettimeofday.h |  10 ++
 arch/powerpc/include/asm/vdso_datapage.h |   2 -
 arch/powerpc/kernel/vdso.c   | 116 ---
 arch/powerpc/kernel/vdso32/vdso32.lds.S  |   2 +-
 arch/powerpc/kernel/vdso64/vdso64.lds.S  |   2 +-
 6 files changed, 114 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index c1344c05226c..71daff5f15d5 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -172,6 +172,7 @@ config PPC
select GENERIC_CPU_AUTOPROBE
select GENERIC_CPU_VULNERABILITIES  if PPC_BARRIER_NOSPEC
select GENERIC_EARLY_IOREMAP
+   select GENERIC_GETTIMEOFDAY
select GENERIC_IRQ_SHOW
select GENERIC_IRQ_SHOW_LEVEL
select GENERIC_PCI_IOMAP    if PCI
@@ -179,7 +180,7 @@ config PPC
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
select GENERIC_TIME_VSYSCALL
-   select GENERIC_GETTIMEOFDAY
+   select GENERIC_VDSO_TIME_NS
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_HUGE_VMAP  if PPC_BOOK3S_64 && 
PPC_RADIX_MMU
select HAVE_ARCH_JUMP_LABEL
diff --git a/arch/powerpc/include/asm/vdso/gettimeofday.h b/arch/powerpc/include/asm/vdso/gettimeofday.h
index d453e725c79f..e448df1dd071 100644
--- a/arch/powerpc/include/asm/vdso/gettimeofday.h
+++ b/arch/powerpc/include/asm/vdso/gettimeofday.h
@@ -2,6 +2,8 @@
 #ifndef _ASM_POWERPC_VDSO_GETTIMEOFDAY_H
 #define _ASM_POWERPC_VDSO_GETTIMEOFDAY_H
 
+#include 
+
 #ifdef __ASSEMBLY__
 
 #include 
@@ -153,6 +155,14 @@ static __always_inline u64 __arch_get_hw_counter(s32 
clock_mode,
 
 const struct vdso_data *__arch_get_vdso_data(void);
 
+#ifdef CONFIG_TIME_NS
+static __always_inline
+const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd)
+{
+   return (void *)vd + PAGE_SIZE;
+}
+#endif
+
 static inline bool vdso_clocksource_ok(const struct vdso_data *vd)
 {
return true;
diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
index 3f958ecf2beb..a585c8e538ff 100644
--- a/arch/powerpc/include/asm/vdso_datapage.h
+++ b/arch/powerpc/include/asm/vdso_datapage.h
@@ -107,9 +107,7 @@ extern struct vdso_arch_data *vdso_data;
bcl 20, 31, .+4
 999:
mflr    \ptr
-#if CONFIG_PPC_PAGE_SHIFT > 14
addis   \ptr, \ptr, (_vdso_datapage - 999b)@ha
-#endif
addi    \ptr, \ptr, (_vdso_datapage - 999b)@l
 .endm
 
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index b14907209822..717f2c9a7573 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -50,6 +51,12 @@ static union {
 } vdso_data_store __page_aligned_data;
 struct vdso_arch_data *vdso_data = &vdso_data_store.data;
 
+enum vvar_pages {
+   VVAR_DATA_PAGE_OFFSET,
+   VVAR_TIMENS_PAGE_OFFSET,
+   VVAR_NR_PAGES,
+};
+
 static int vdso_mremap(const struct vm_special_mapping *sm, struct 
vm_area_struct *new_vma,
   unsigned long text_size)
 {
@@ -73,8 +80,12 @@ static int vdso64_mremap(const struct vm_special_mapping 
*sm, struct vm_area_str
return vdso_mremap(sm, new_vma, _end - _start);
 }
 
+static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
+struct vm_area_struct *vma, struct vm_fault *vmf);
+
 static struct vm_special_mapping vvar_spec __ro_after_init = {
.name = "[vvar]",
+   .fault = vvar_fault,
 };
 
 static struct vm_special_mapping vdso32_spec __ro_after_init = {
@@ -87,6 +98,94 @@ static struct vm_special_mapping vdso64_spec __ro_after_init 
= {
.mremap = vdso64_mremap,
 };
 
+#ifdef CONFIG_TIME_NS
+struct vdso_data *arch_get_vdso_data(void *vvar_page)
+{
+   return ((struct vdso_arch_data *)vvar_page)->data;
+}
+
+/*
+ * The vvar mapping contains data for a specific time namespace, so when a task
+ * changes namespace we must unmap its vvar data for the old namespace.
+ * Subsequent faults will map in data for the new namespace.
+ *
+ * For more details see timens_setup_vdso_data().
+ */
+int vdso_join_timens(struct task_struct *task, struct time_namespace *ns)
+{
+   struct mm_struct *mm = task->mm;
+   struct vm_area_struct *vma;
+
+   mmap_read_lock(mm);
+
+   for (vma = mm->mmap; vma; vma = vma->vm_next) {
+   unsigned long s

[PATCH RESEND v1 0/4] powerpc/vdso: Add support for time namespaces

2021-03-31 Thread Christophe Leroy
[Sorry, resending with complete destination list, I used the wrong script on 
the first delivery]

This series adds support for time namespaces on powerpc.

All timens selftests pass.

Christophe Leroy (3):
  lib/vdso: Mark do_hres_timens() and do_coarse_timens()
__always_inline()
  lib/vdso: Add vdso_data pointer as input to
__arch_get_timens_vdso_data()
  powerpc/vdso: Add support for time namespaces

Dmitry Safonov (1):
  powerpc/vdso: Separate vvar vma from vdso

 .../include/asm/vdso/compat_gettimeofday.h|   3 +-
 arch/arm64/include/asm/vdso/gettimeofday.h|   2 +-
 arch/powerpc/Kconfig  |   3 +-
 arch/powerpc/include/asm/mmu_context.h|   2 +-
 arch/powerpc/include/asm/vdso/gettimeofday.h  |  10 ++
 arch/powerpc/include/asm/vdso_datapage.h  |   2 -
 arch/powerpc/kernel/vdso.c| 138 --
 arch/powerpc/kernel/vdso32/vdso32.lds.S   |   2 +-
 arch/powerpc/kernel/vdso64/vdso64.lds.S   |   2 +-
 arch/s390/include/asm/vdso/gettimeofday.h |   3 +-
 arch/x86/include/asm/vdso/gettimeofday.h  |   3 +-
 lib/vdso/gettimeofday.c   |  31 ++--
 12 files changed, 162 insertions(+), 39 deletions(-)

-- 
2.25.0



[PATCH v1 4/4] powerpc/vdso: Add support for time namespaces

2021-03-31 Thread Christophe Leroy
This patch adds the necessary glue to provide time namespaces.

Things are mainly copied from ARM64.

__arch_get_timens_vdso_data() calculates timens vdso data position
based on the vdso data position, knowing it is the next page in vvar.
This avoids having to redo the mflr/bcl/mflr/mtlr dance to locate
the page relative to running code position.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Kconfig |   3 +-
 arch/powerpc/include/asm/vdso/gettimeofday.h |  10 ++
 arch/powerpc/include/asm/vdso_datapage.h |   2 -
 arch/powerpc/kernel/vdso.c   | 116 ---
 arch/powerpc/kernel/vdso32/vdso32.lds.S  |   2 +-
 arch/powerpc/kernel/vdso64/vdso64.lds.S  |   2 +-
 6 files changed, 114 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index c1344c05226c..71daff5f15d5 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -172,6 +172,7 @@ config PPC
select GENERIC_CPU_AUTOPROBE
select GENERIC_CPU_VULNERABILITIES  if PPC_BARRIER_NOSPEC
select GENERIC_EARLY_IOREMAP
+   select GENERIC_GETTIMEOFDAY
select GENERIC_IRQ_SHOW
select GENERIC_IRQ_SHOW_LEVEL
select GENERIC_PCI_IOMAP    if PCI
@@ -179,7 +180,7 @@ config PPC
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
select GENERIC_TIME_VSYSCALL
-   select GENERIC_GETTIMEOFDAY
+   select GENERIC_VDSO_TIME_NS
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_HUGE_VMAP  if PPC_BOOK3S_64 && 
PPC_RADIX_MMU
select HAVE_ARCH_JUMP_LABEL
diff --git a/arch/powerpc/include/asm/vdso/gettimeofday.h b/arch/powerpc/include/asm/vdso/gettimeofday.h
index d453e725c79f..e448df1dd071 100644
--- a/arch/powerpc/include/asm/vdso/gettimeofday.h
+++ b/arch/powerpc/include/asm/vdso/gettimeofday.h
@@ -2,6 +2,8 @@
 #ifndef _ASM_POWERPC_VDSO_GETTIMEOFDAY_H
 #define _ASM_POWERPC_VDSO_GETTIMEOFDAY_H
 
+#include 
+
 #ifdef __ASSEMBLY__
 
 #include 
@@ -153,6 +155,14 @@ static __always_inline u64 __arch_get_hw_counter(s32 
clock_mode,
 
 const struct vdso_data *__arch_get_vdso_data(void);
 
+#ifdef CONFIG_TIME_NS
+static __always_inline
+const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd)
+{
+   return (void *)vd + PAGE_SIZE;
+}
+#endif
+
 static inline bool vdso_clocksource_ok(const struct vdso_data *vd)
 {
return true;
diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
index 3f958ecf2beb..a585c8e538ff 100644
--- a/arch/powerpc/include/asm/vdso_datapage.h
+++ b/arch/powerpc/include/asm/vdso_datapage.h
@@ -107,9 +107,7 @@ extern struct vdso_arch_data *vdso_data;
bcl 20, 31, .+4
 999:
mflr    \ptr
-#if CONFIG_PPC_PAGE_SHIFT > 14
addis   \ptr, \ptr, (_vdso_datapage - 999b)@ha
-#endif
addi    \ptr, \ptr, (_vdso_datapage - 999b)@l
 .endm
 
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index b14907209822..717f2c9a7573 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -50,6 +51,12 @@ static union {
 } vdso_data_store __page_aligned_data;
 struct vdso_arch_data *vdso_data = &vdso_data_store.data;
 
+enum vvar_pages {
+   VVAR_DATA_PAGE_OFFSET,
+   VVAR_TIMENS_PAGE_OFFSET,
+   VVAR_NR_PAGES,
+};
+
 static int vdso_mremap(const struct vm_special_mapping *sm, struct 
vm_area_struct *new_vma,
   unsigned long text_size)
 {
@@ -73,8 +80,12 @@ static int vdso64_mremap(const struct vm_special_mapping 
*sm, struct vm_area_str
return vdso_mremap(sm, new_vma, _end - _start);
 }
 
+static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
+struct vm_area_struct *vma, struct vm_fault *vmf);
+
 static struct vm_special_mapping vvar_spec __ro_after_init = {
.name = "[vvar]",
+   .fault = vvar_fault,
 };
 
 static struct vm_special_mapping vdso32_spec __ro_after_init = {
@@ -87,6 +98,94 @@ static struct vm_special_mapping vdso64_spec __ro_after_init 
= {
.mremap = vdso64_mremap,
 };
 
+#ifdef CONFIG_TIME_NS
+struct vdso_data *arch_get_vdso_data(void *vvar_page)
+{
+   return ((struct vdso_arch_data *)vvar_page)->data;
+}
+
+/*
+ * The vvar mapping contains data for a specific time namespace, so when a task
+ * changes namespace we must unmap its vvar data for the old namespace.
+ * Subsequent faults will map in data for the new namespace.
+ *
+ * For more details see timens_setup_vdso_data().
+ */
+int vdso_join_timens(struct task_struct *task, struct time_namespace *ns)
+{
+   struct mm_struct *mm = task->mm;
+   struct vm_area_struct *vma;
+
+   mmap_read_lock(mm);
+
+   for (vma = mm->mmap; vma; vma = vma->vm_next) {
+   unsigned long s

[PATCH v1 0/4] powerpc/vdso: Add support for time namespaces

2021-03-31 Thread Christophe Leroy
This series adds support for time namespaces on powerpc.

All timens selftests pass.

Christophe Leroy (3):
  lib/vdso: Mark do_hres_timens() and do_coarse_timens()
__always_inline()
  lib/vdso: Add vdso_data pointer as input to
__arch_get_timens_vdso_data()
  powerpc/vdso: Add support for time namespaces

Dmitry Safonov (1):
  powerpc/vdso: Separate vvar vma from vdso

 .../include/asm/vdso/compat_gettimeofday.h|   3 +-
 arch/arm64/include/asm/vdso/gettimeofday.h|   2 +-
 arch/powerpc/Kconfig  |   3 +-
 arch/powerpc/include/asm/mmu_context.h|   2 +-
 arch/powerpc/include/asm/vdso/gettimeofday.h  |  10 ++
 arch/powerpc/include/asm/vdso_datapage.h  |   2 -
 arch/powerpc/kernel/vdso.c| 138 --
 arch/powerpc/kernel/vdso32/vdso32.lds.S   |   2 +-
 arch/powerpc/kernel/vdso64/vdso64.lds.S   |   2 +-
 arch/s390/include/asm/vdso/gettimeofday.h |   3 +-
 arch/x86/include/asm/vdso/gettimeofday.h  |   3 +-
 lib/vdso/gettimeofday.c   |  31 ++--
 12 files changed, 162 insertions(+), 39 deletions(-)

-- 
2.25.0



Re: [PATCH v3 1/3] keys: cleanup build time module signing keys

2021-03-30 Thread Jarkko Sakkinen
On Tue, Mar 30, 2021 at 09:16:34AM -0400, Nayna Jain wrote:
> The "mrproper" target is still looking for build time generated keys in
> the kernel root directory instead of certs directory. Fix the path and
> remove the names of the files which are no longer generated.
> 
> Fixes: cfc411e7fff3 ("Move certificate handling to its own directory")
> Signed-off-by: Nayna Jain 
> Reviewed-by: Stefan Berger 
> Reviewed-by: Mimi Zohar 
> ---
>  Makefile | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index d4784d181123..b7c2ed2a8684 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1523,9 +1523,9 @@ MRPROPER_FILES += include/config include/generated  
> \
> debian snap tar-install \
> .config .config.old .version \
> Module.symvers \
> -   signing_key.pem signing_key.priv signing_key.x509 \
> -   x509.genkey extra_certificates signing_key.x509.keyid \
> -   signing_key.x509.signer vmlinux-gdb.py \
> +   certs/signing_key.pem certs/signing_key.x509 \
> +   certs/x509.genkey \
> +   vmlinux-gdb.py \
> *.spec
>  
>  # Directories & files removed with 'make distclean'
> -- 
> 2.29.2
> 
> 



Reviewed-by: Jarkko Sakkinen 

/Jarkko


[RFC v2 42/43] shmem: reduce time holding xa_lock when inserting pages

2021-03-30 Thread Anthony Yznaga
Rather than adding one page at a time to the page cache and taking the
page cache xarray lock each time, where possible add pages in bulk by
first populating an xarray node outside of the page cache before taking
the lock to insert it.
When a group of pages to be inserted will fill an xarray node, add them
to a local xarray, export the xarray node, and then take the lock on the
page cache xarray and insert the node.
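
How many pages "fill an xarray node" follows from the xa_shift computation and the alignment check in the diff below. The sketch assumes XA_CHUNK_SHIFT is 6 (64 slots per node, the value without CONFIG_BASE_SMALL); the helper names are hypothetical.

```c
#include <assert.h>

/* Assumed: 64 slots per xarray node (the value without CONFIG_BASE_SMALL). */
#define XA_CHUNK_SHIFT 6

/* Same formula as in the patch: round order up past the next node boundary. */
static int bulk_xa_shift(int order)
{
	return order + XA_CHUNK_SHIFT - (order % XA_CHUNK_SHIFT);
}

/*
 * Number of pages of the given order that satisfy the patch's check
 * npages * (1 << order) == 1 << xa_shift.
 */
static unsigned long pages_per_bulk_insert(int order)
{
	assert(order >= 0);
	return (1UL << bulk_xa_shift(order)) >> order;
}
```

Under that assumption, order-0 pages are inserted 64 at a time (xa_shift 6), while order-9 pages are inserted 8 at a time (xa_shift 12, covering 4096 indices).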

Signed-off-by: Anthony Yznaga 
---
 mm/shmem.c | 162 ++---
 1 file changed, 156 insertions(+), 6 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index f495af51042e..a7c23b43b57f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -827,17 +827,149 @@ static void shmem_delete_from_page_cache(struct page 
*page, void *radswap)
BUG_ON(error);
 }
 
+static int shmem_add_aligned_to_page_cache(struct page *pages[], int npages,
+  struct address_space *mapping,
+  pgoff_t index, gfp_t gfp, int order,
+  struct mm_struct *charge_mm)
+{
+   int xa_shift = order + XA_CHUNK_SHIFT - (order % XA_CHUNK_SHIFT);
+   XA_STATE_ORDER(xas, &mapping->i_pages, index, xa_shift);
+   struct xarray xa_tmp;
+   /*
+* Specify order so xas_create_range() only needs to be called once
+* to allocate the entire range.  This guarantees that xas_store()
+* will not fail due to lack of memory.
+* Specify index == 0 so the minimum necessary nodes are allocated.
+*/
+   XA_STATE_ORDER(xas_tmp, &xa_tmp, 0, xa_shift);
+   unsigned long nr = 1UL << order;
+   struct xa_node *node;
+   int i, error;
+
+   if (npages * nr != 1 << xa_shift) {
+   WARN_ONCE(1, "npages (%d) not aligned to xa_shift\n", npages);
+   return -EINVAL;
+   }
+   if (!IS_ALIGNED(index, 1 << xa_shift)) {
+   WARN_ONCE(1, "index (%lu) not aligned to xa_shift\n", index);
+   return -EINVAL;
+   }
+
+   for (i = 0; i < npages; i++) {
+   bool skipcharge = page_memcg(pages[i]) ? true : false;
+
+   VM_BUG_ON_PAGE(PageTail(pages[i]), pages[i]);
+   VM_BUG_ON_PAGE(!PageLocked(pages[i]), pages[i]);
+   VM_BUG_ON_PAGE(!PageSwapBacked(pages[i]), pages[i]);
+
+   page_ref_add(pages[i], nr);
+   pages[i]->mapping = mapping;
+   pages[i]->index = index + (i * nr);
+
+   if (!skipcharge && !PageSwapCache(pages[i])) {
+   error = mem_cgroup_charge(pages[i], charge_mm, gfp);
+   if (error) {
+   if (PageTransHuge(pages[i])) {
+   count_vm_event(THP_FILE_FALLBACK);
+   count_vm_event(THP_FILE_FALLBACK_CHARGE);
+   }
+   goto error;
+   }
+   }
+   cgroup_throttle_swaprate(pages[i], gfp);
+   }
+
+   xa_init(&xa_tmp);
+   do {
+   xas_lock(&xas_tmp);
+   xas_create_range(&xas_tmp);
+   if (xas_error(&xas_tmp))
+   goto unlock;
+   for (i = 0; i < npages; i++) {
+   int j = 0;
+next:
+   xas_store(&xas_tmp, pages[i]);
+   if (++j < nr) {
+   xas_next(&xas_tmp);
+   goto next;
+   }
+   if (i < npages - 1)
+   xas_next(&xas_tmp);
+   }
+   xas_set_order(&xas_tmp, 0, xa_shift);
+   node = xas_export_node(&xas_tmp);
+unlock:
+   xas_unlock(&xas_tmp);
+   } while (xas_nomem(&xas_tmp, gfp));
+
+   if (xas_error(&xas_tmp)) {
+   error = xas_error(&xas_tmp);
+   i = npages - 1;
+   goto error;
+   }
+
+   do {
+   xas_lock_irq(&xas);
+   xas_import_node(&xas, node);
+   if (xas_error(&xas))
+   goto unlock1;
+   mapping->nrpages += nr * npages;
+   xas_unlock(&xas);
+   for (i = 0; i < npages; i++) {
+   __mod_lruvec_page_state(pages[i], NR_FILE_PAGES, nr);
+   __mod_lruvec_page_state(pages[i], NR_SHMEM, nr);
+   if (PageTransHuge(pages[i])) {
+   count_vm_event(THP_FILE_ALLOC);
+   __inc_node_page_state(pages[i], NR_SHMEM_THPS);
+   }
+   }
+   local_irq_enable();
+   break;
+unlock1:
+   xas_unlock_irq(&xas);
+   } while (xas_nomem(&xas, gfp));
+
+   if (xas_error(&xas)) {
+   error = xas_error(&xas);
+

[RFC v2 32/43] shmem: preserve shmem files a chunk at a time

2021-03-30 Thread Anthony Yznaga
To prepare for multithreading the work to preserve a shmem file,
divide the work into subranges of the total index range of the file.
The chunk size is a rather arbitrary 256k indices.

Signed-off-by: Anthony Yznaga 
---
 mm/shmem_pkram.c | 64 +---
 1 file changed, 57 insertions(+), 7 deletions(-)

diff --git a/mm/shmem_pkram.c b/mm/shmem_pkram.c
index 8682b0c002c0..e52722b3a709 100644
--- a/mm/shmem_pkram.c
+++ b/mm/shmem_pkram.c
@@ -74,16 +74,14 @@ static int save_page(struct page *page, struct pkram_access 
*pa)
return err;
 }
 
-static int save_file_content(struct pkram_stream *ps, struct address_space 
*mapping)
+static int save_file_content_range(struct pkram_access *pa,
+  struct address_space *mapping,
+  unsigned long start, unsigned long end)
 {
-   PKRAM_ACCESS(pa, ps, pages);
struct pagevec pvec;
-   unsigned long start, end;
int err = 0;
int i;
 
-   start = 0;
-   end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
pagevec_init(&pvec);
for ( ; ; ) {
pvec.nr = find_get_pages_range(mapping, &start, end,
@@ -95,7 +93,7 @@ static int save_file_content(struct pkram_stream *ps, struct 
address_space *mapp
 
lock_page(page);
BUG_ON(page->mapping != mapping);
-   err = save_page(page, &pa);
+   err = save_page(page, pa);
if (PageCompound(page)) {
start = page->index + compound_nr(page);
i += compound_nr(page);
@@ -113,10 +111,62 @@ static int save_file_content(struct pkram_stream *ps, 
struct address_space *mapp
cond_resched();
}
 
-   pkram_finish_access(&pa, err == 0);
return err;
 }
 
+struct shmem_pkram_arg {
+   struct pkram_stream *ps;
+   struct address_space *mapping;
+   struct mm_struct *mm;
+   atomic64_t next;
+};
+
+unsigned long shmem_pkram_max_index_range = 512 * 512;
+
+static int get_save_range(unsigned long max, atomic64_t *next, unsigned long 
*start, unsigned long *end)
+{
+   unsigned long index;
+ 
+   index = atomic64_fetch_add(shmem_pkram_max_index_range, next);
+   if (index >= max)
+   return -ENODATA;
+ 
+   *start = index;
+   *end = index + shmem_pkram_max_index_range - 1;
+ 
+   return 0;
+}
+
+static int do_save_file_content(struct pkram_stream *ps,
+   struct address_space *mapping,
+   atomic64_t *next)
+{
+   PKRAM_ACCESS(pa, ps, pages);
+   unsigned long start, end, max;
+   int ret;
+ 
+   max = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
+ 
+   do {
+   ret = get_save_range(max, next, &start, &end);
+   if (!ret)
+   ret = save_file_content_range(&pa, mapping, start, end);
+   } while (!ret);
+ 
+   if (ret == -ENODATA)
+   ret = 0;
+ 
+   pkram_finish_access(&pa, ret == 0);
+   return ret;
+}
+
+static int save_file_content(struct pkram_stream *ps, struct address_space 
*mapping)
+{
+   struct shmem_pkram_arg arg = { ps, mapping, NULL, ATOMIC64_INIT(0) };
+ 
+   return do_save_file_content(arg.ps, arg.mapping, &arg.next);
+}
+
 static int save_file(struct dentry *dentry, struct pkram_stream *ps)
 {
PKRAM_ACCESS(pa_bytes, ps, bytes);
-- 
1.8.3.1
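
The chunk hand-out in get_save_range() can be reproduced in user space with C11 atomics. This is a sketch under assumptions: `atomic_ulong` and `atomic_fetch_add()` stand in for the kernel's `atomic64_t` and `atomic64_fetch_add()`, and -1 stands in for -ENODATA.

```c
#include <assert.h>
#include <stdatomic.h>

/* 512 * 512 = 262144 indices per chunk, matching the patch's default. */
#define MAX_INDEX_RANGE (512UL * 512UL)

/*
 * Claim the next chunk of indices. Returns 0 and fills in an inclusive
 * [*start, *end] range, or -1 once the counter has passed max.
 * Each caller gets a distinct chunk because fetch-add is atomic.
 */
static int get_save_range(unsigned long max, atomic_ulong *next,
			  unsigned long *start, unsigned long *end)
{
	unsigned long index;

	assert(next && start && end);
	index = atomic_fetch_add(next, MAX_INDEX_RANGE);
	if (index >= max)
		return -1;

	*start = index;
	*end = index + MAX_INDEX_RANGE - 1;
	return 0;
}
```

As in the patch, the last chunk's end is not clamped to max, so the code that walks the range has to tolerate an end beyond the file size.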



Re: [PATCH net-next 1/1] stmmac: intel: add cross time-stamping freq difference adjustment

2021-03-30 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net-next.git (refs/heads/master):

On Tue, 30 Mar 2021 10:46:53 +0800 you wrote:
> Cross time-stamping mechanism used in certain instance of Intel mGbE
> may run at different clock frequency in comparison to the clock
> frequency used by processor, so we introduce cross T/S frequency
> adjustment to ensure TSC calculation is correct when processor got the
> cross time-stamps.
> 
> Signed-off-by: Wong Vee Khee 
> 
> [...]

Here is the summary with links:
  - [net-next,1/1] stmmac: intel: add cross time-stamping freq difference adjustment
https://git.kernel.org/netdev/net-next/c/1c137d4777b5

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




[PATCH v19 4/7] time: Add mechanism to recognize clocksource in time_get_snapshot

2021-03-30 Thread Marc Zyngier
From: Thomas Gleixner 

System time snapshots do not convey which clocksource was in use when
they were taken, but callers like the PTP KVM guest implementation need
to evaluate the clocksource type to select the appropriate mechanism.

Introduce a clocksource id field in struct clocksource which is by default
set to CSID_GENERIC (0). Clocksource implementations can set that field to
a value that identifies the clocksource.

Store the clocksource id of the current clocksource in the
system_time_snapshot so callers can evaluate which clocksource was used to
take the snapshot and act accordingly.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Jianyong Wu 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20201209060932.212364-5-jianyong...@arm.com
---
 include/linux/clocksource.h |  6 ++
 include/linux/clocksource_ids.h | 11 +++
 include/linux/timekeeping.h | 12 +++-
 kernel/time/clocksource.c   |  2 ++
 kernel/time/timekeeping.c   |  1 +
 5 files changed, 27 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/clocksource_ids.h

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 86d143db6523..1290d0dce840 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -62,6 +63,10 @@ struct module;
  * 400-499: Perfect
  * The ideal clocksource. A must-use where
  * available.
+ * @id:    Defaults to CSID_GENERIC. The id value is captured
+ * in certain snapshot functions to allow callers to
+ * validate the clocksource from which the snapshot was
+ * taken.
  * @flags: Flags describing special properties
  * @enable:Optional function to enable the clocksource
  * @disable:   Optional function to disable the clocksource
@@ -100,6 +105,7 @@ struct clocksource {
const char  *name;
struct list_head    list;
int rating;
+   enum clocksource_ids    id;
enum vdso_clock_mode    vdso_clock_mode;
unsigned long   flags;
 
diff --git a/include/linux/clocksource_ids.h b/include/linux/clocksource_ids.h
new file mode 100644
index 000000000000..4d8e19e05328
--- /dev/null
+++ b/include/linux/clocksource_ids.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_CLOCKSOURCE_IDS_H
+#define _LINUX_CLOCKSOURCE_IDS_H
+
+/* Enum to give clocksources a unique identifier */
+enum clocksource_ids {
+   CSID_GENERIC    = 0,
+   CSID_MAX,
+};
+
+#endif
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index c6792cf01bc7..78a98bdff76d 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -3,6 +3,7 @@
 #define _LINUX_TIMEKEEPING_H
 
 #include 
+#include 
 
 /* Included from linux/ktime.h */
 
@@ -243,11 +244,12 @@ struct ktime_timestamps {
  * @cs_was_changed_seq:The sequence number of clocksource change events
  */
 struct system_time_snapshot {
-   u64 cycles;
-   ktime_t real;
-   ktime_t raw;
-   unsigned int    clock_was_set_seq;
-   u8  cs_was_changed_seq;
+   u64 cycles;
+   ktime_t real;
+   ktime_t raw;
+   enum clocksource_ids    cs_id;
+   unsigned int    clock_was_set_seq;
+   u8  cs_was_changed_seq;
 };
 
 /**
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index cce484a2cc7c..4fe1df894ee5 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -920,6 +920,8 @@ int __clocksource_register_scale(struct clocksource *cs, 
u32 scale, u32 freq)
 
clocksource_arch_init(cs);
 
+   if (WARN_ON_ONCE((unsigned int)cs->id >= CSID_MAX))
+   cs->id = CSID_GENERIC;
if (cs->vdso_clock_mode < 0 ||
cs->vdso_clock_mode >= VDSO_CLOCKMODE_MAX) {
pr_warn("clocksource %s registered with invalid VDSO mode %d. Disabling VDSO support.\n",
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6aee5768c86f..06f55f9258bf 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1048,6 +1048,7 @@ void ktime_get_snapshot(struct system_time_snapshot 
*systime_snapshot)
do {
seq = read_seqcount_begin(&tk_core.seq);
now = tk_clock_read(&tk->tkr_mono);
+   systime_snapshot->cs_id = tk->tkr_mono.clock->id;
systime_snapshot->cs_was_changed_seq = tk->cs_was_changed_seq;
systime_snapshot->clock_was_set_seq = tk->clock_was_set_seq;
base_real = ktime_add(tk->tkr_mono.base,
-- 
2.29.2
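
How a caller might use the new field can be sketched in user space. The structs below are reduced, hypothetical stand-ins for struct clocksource and struct system_time_snapshot, and the helper names are invented for illustration. Only the enum and the unsigned range check mirror the patch.

```c
#include <assert.h>

/* Enum to give clocksources a unique identifier, as in the patch. */
enum clocksource_ids {
	CSID_GENERIC = 0,
	CSID_MAX,
};

/* Reduced stand-ins: only the fields this sketch needs. */
struct clocksource {
	const char *name;
	enum clocksource_ids id;
};

struct system_time_snapshot {
	unsigned long long cycles;
	enum clocksource_ids cs_id;
};

/* Registration-time check: the unsigned cast also catches negative ids. */
static void clocksource_sanitize_id(struct clocksource *cs)
{
	assert(cs);
	if ((unsigned int)cs->id >= CSID_MAX)
		cs->id = CSID_GENERIC;
}

/* Record which clocksource produced the cycle value. */
static void take_snapshot(const struct clocksource *cs, unsigned long long cycles,
			  struct system_time_snapshot *snap)
{
	snap->cycles = cycles;
	snap->cs_id = cs->id;
}

/* A caller can now test where the snapshot came from before using it. */
static int snapshot_taken_from(const struct system_time_snapshot *snap,
			       enum clocksource_ids wanted)
{
	return snap->cs_id == wanted;
}
```

A consumer that only understands one counter would call something like snapshot_taken_from() and refuse snapshots from any other clocksource.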



[PATCH v3 3/3] ima: enable loading of build time generated key on .ima keyring

2021-03-30 Thread Nayna Jain
The kernel currently only loads the kernel module signing key onto the
builtin trusted keyring. Load the module signing key onto the IMA keyring
as well.
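
The blob layout this patch relies on (module signing certificate first, remaining certificates after it) can be sketched as plain pointer ranges. The names `struct cert_range`, `module_cert_range()` and `system_cert_range()` are hypothetical. Note that the sketch computes the end pointer from the start of the blob; in the hunk as posted, `end = p + system_certificate_list_size` is evaluated after `p` may already have been advanced by `module_cert_size`, which appears to place `end` past the blob in the non-MODULE_SIG case.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open byte range [p, end) inside the certificate blob. */
struct cert_range {
	const unsigned char *p;
	const unsigned char *end;
};

/* The module signing certificate sits at the start of the blob. */
static struct cert_range module_cert_range(const unsigned char *list,
					   size_t module_cert_size)
{
	struct cert_range r = { list, list + module_cert_size };

	return r;
}

/*
 * Range for the builtin keyring: the whole blob when the module cert is
 * to be included, otherwise everything after it. The end is always
 * derived from the blob start so it cannot run past the blob.
 */
static struct cert_range system_cert_range(const unsigned char *list,
					   size_t list_size,
					   size_t module_cert_size,
					   int include_module_cert)
{
	struct cert_range r;

	assert(module_cert_size <= list_size);
	r.p = include_module_cert ? list : list + module_cert_size;
	r.end = list + list_size;
	return r;
}
```

Keeping the two ranges as explicit start/end pairs makes it easy to see that the IMA keyring and the builtin keyring are fed from the same blob, differing only in where parsing starts and stops.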

Signed-off-by: Nayna Jain 
---
 certs/system_certificates.S   | 13 +-
 certs/system_keyring.c| 47 +++
 include/keys/system_keyring.h |  7 ++
 security/integrity/digsig.c   |  2 ++
 4 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/certs/system_certificates.S b/certs/system_certificates.S
index 8f29058adf93..dcad27ea8527 100644
--- a/certs/system_certificates.S
+++ b/certs/system_certificates.S
@@ -8,9 +8,11 @@
.globl system_certificate_list
 system_certificate_list:
 __cert_list_start:
-#ifdef CONFIG_MODULE_SIG
+__module_cert_start:
+#if defined(CONFIG_MODULE_SIG) || defined(CONFIG_IMA_APPRAISE_MODSIG)
.incbin "certs/signing_key.x509"
 #endif
+__module_cert_end:
.incbin "certs/x509_certificate_list"
 __cert_list_end:
 
@@ -35,3 +37,12 @@ system_certificate_list_size:
 #else
.long __cert_list_end - __cert_list_start
 #endif
+
+   .align 8
+   .globl module_cert_size
+module_cert_size:
+#ifdef CONFIG_64BIT
+   .quad __module_cert_end - __module_cert_start
+#else
+   .long __module_cert_end - __module_cert_start
+#endif
diff --git a/certs/system_keyring.c b/certs/system_keyring.c
index 4b693da488f1..bb122bf4cc17 100644
--- a/certs/system_keyring.c
+++ b/certs/system_keyring.c
@@ -27,6 +27,7 @@ static struct key *platform_trusted_keys;
 
 extern __initconst const u8 system_certificate_list[];
 extern __initconst const unsigned long system_certificate_list_size;
+extern __initconst const unsigned long module_cert_size;
 
 /**
  * restrict_link_to_builtin_trusted - Restrict keyring addition by built in CA
@@ -132,19 +133,11 @@ static __init int system_trusted_keyring_init(void)
  */
 device_initcall(system_trusted_keyring_init);
 
-/*
- * Load the compiled-in list of X.509 certificates.
- */
-static __init int load_system_certificate_list(void)
+static __init int load_cert(const u8 *p, const u8 *end, struct key *keyring)
 {
key_ref_t key;
-   const u8 *p, *end;
size_t plen;
 
-   pr_notice("Loading compiled-in X.509 certificates\n");
-
-   p = system_certificate_list;
-   end = p + system_certificate_list_size;
while (p < end) {
/* Each cert begins with an ASN.1 SEQUENCE tag and must be more
 * than 256 bytes in size.
@@ -159,7 +152,7 @@ static __init int load_system_certificate_list(void)
if (plen > end - p)
goto dodgy_cert;
 
-   key = key_create_or_update(make_key_ref(builtin_trusted_keys, 1),
+   key = key_create_or_update(make_key_ref(keyring, 1),
   "asymmetric",
   NULL,
   p,
@@ -186,6 +179,40 @@ static __init int load_system_certificate_list(void)
pr_err("Problem parsing in-kernel X.509 certificate list\n");
return 0;
 }
+
+__init int load_module_cert(struct key *keyring)
+{
+   const u8 *p, *end;
+
+   if (!IS_ENABLED(CONFIG_IMA_APPRAISE_MODSIG))
+   return 0;
+
+   pr_notice("Loading compiled-in module X.509 certificates\n");
+
+   p = system_certificate_list;
+   end = p + module_cert_size;
+
+   return load_cert(p, end, keyring);
+}
+
+/*
+ * Load the compiled-in list of X.509 certificates.
+ */
+static __init int load_system_certificate_list(void)
+{
+   const u8 *p, *end;
+
+   pr_notice("Loading compiled-in X.509 certificates\n");
+
+#ifdef CONFIG_MODULE_SIG
+   p = system_certificate_list;
+#else
+   p = system_certificate_list + module_cert_size;
+#endif
+
+   end = p + system_certificate_list_size;
+   return load_cert(p, end, builtin_trusted_keys);
+}
 late_initcall(load_system_certificate_list);
 
 #ifdef CONFIG_SYSTEM_DATA_VERIFICATION
diff --git a/include/keys/system_keyring.h b/include/keys/system_keyring.h
index fb8b07daa9d1..f954276c616a 100644
--- a/include/keys/system_keyring.h
+++ b/include/keys/system_keyring.h
@@ -16,9 +16,16 @@ extern int restrict_link_by_builtin_trusted(struct key *keyring,
const struct key_type *type,
const union key_payload *payload,
struct key *restriction_key);
+extern __init int load_module_cert(struct key *keyring);
 
 #else
 #define restrict_link_by_builtin_trusted restrict_link_reject
+
+static inline __init int load_module_cert(struct key *keyring)
+{
+   return 0;
+}
+
 #endif
 
 #ifdef CONFIG_SECONDARY_TRUSTED_KEYRING
diff --git a/security/integrity/digsig.c b/security/integrity/digsig.c
index 250fb0836156..3b06a01bd0fd 100644
--- a/security/integrity/digsig.c
+++ b/security/integrity/digsig.c
@@ -111,6 +111,8 @@ static int 

[PATCH v3 2/3] ima: enable signing of modules with build time generated key

2021-03-30 Thread Nayna Jain
The kernel build process currently only signs kernel modules when
MODULE_SIG is enabled. Also, sign the kernel modules at build time when
IMA_APPRAISE_MODSIG is enabled.

Signed-off-by: Nayna Jain 
---
 certs/Kconfig  | 2 +-
 certs/Makefile | 8 
 init/Kconfig   | 6 +++---
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/certs/Kconfig b/certs/Kconfig
index c94e93d8bccf..48675ad319db 100644
--- a/certs/Kconfig
+++ b/certs/Kconfig
@@ -4,7 +4,7 @@ menu "Certificates for signature checking"
 config MODULE_SIG_KEY
string "File name or PKCS#11 URI of module signing key"
default "certs/signing_key.pem"
-   depends on MODULE_SIG
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
help
  Provide the file name of a private key/certificate in PEM format,
  or a PKCS#11 URI according to RFC7512. The file should contain, or
diff --git a/certs/Makefile b/certs/Makefile
index f4c25b67aad9..e3185c57fbd8 100644
--- a/certs/Makefile
+++ b/certs/Makefile
@@ -32,6 +32,14 @@ endif # CONFIG_SYSTEM_TRUSTED_KEYRING
 clean-files := x509_certificate_list .x509.list
 
 ifeq ($(CONFIG_MODULE_SIG),y)
+   SIGN_KEY = y
+endif
+
+ifeq ($(CONFIG_IMA_APPRAISE_MODSIG),y)
+   SIGN_KEY = y
+endif
+
+ifdef SIGN_KEY
 ###
 #
 # If module signing is requested, say by allyesconfig, but a key has not been
diff --git a/init/Kconfig b/init/Kconfig
index 5f5c776ef192..85e48a578f90 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2164,7 +2164,7 @@ config MODULE_SIG_FORCE
 config MODULE_SIG_ALL
bool "Automatically sign all modules"
default y
-   depends on MODULE_SIG
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
help
  Sign all modules during make modules_install. Without this option,
  modules must be signed manually, using the scripts/sign-file tool.
@@ -2174,7 +2174,7 @@ comment "Do not forget to sign required modules with scripts/sign-file"
 
 choice
prompt "Which hash algorithm should modules be signed with?"
-   depends on MODULE_SIG
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
help
  This determines which sort of hashing algorithm will be used during
  signature generation.  This algorithm _must_ be built into the kernel
@@ -2206,7 +2206,7 @@ endchoice
 
 config MODULE_SIG_HASH
string
-   depends on MODULE_SIG
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
default "sha1" if MODULE_SIG_SHA1
default "sha224" if MODULE_SIG_SHA224
default "sha256" if MODULE_SIG_SHA256
-- 
2.29.2



[PATCH v3 1/3] keys: cleanup build time module signing keys

2021-03-30 Thread Nayna Jain
The "mrproper" target is still looking for build time generated keys in
the kernel root directory instead of certs directory. Fix the path and
remove the names of the files which are no longer generated.

Fixes: cfc411e7fff3 ("Move certificate handling to its own directory")
Signed-off-by: Nayna Jain 
Reviewed-by: Stefan Berger 
Reviewed-by: Mimi Zohar 
---
 Makefile | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index d4784d181123..b7c2ed2a8684 100644
--- a/Makefile
+++ b/Makefile
@@ -1523,9 +1523,9 @@ MRPROPER_FILES += include/config include/generated
  \
  debian snap tar-install \
  .config .config.old .version \
  Module.symvers \
- signing_key.pem signing_key.priv signing_key.x509 \
- x509.genkey extra_certificates signing_key.x509.keyid \
- signing_key.x509.signer vmlinux-gdb.py \
+ certs/signing_key.pem certs/signing_key.x509 \
+ certs/x509.genkey \
+ vmlinux-gdb.py \
  *.spec
 
 # Directories & files removed with 'make distclean'
-- 
2.29.2



Re: [PATCH 01/10] platform/x86: toshiba_acpi: bind life-time of toshiba_acpi_dev to parent

2021-03-30 Thread Alexandru Ardelean
On Mon, 29 Mar 2021 at 17:30, Jonathan Cameron  wrote:
>
> On Wed, 24 Mar 2021 14:55:39 +0200
> Alexandru Ardelean  wrote:
>
> > The 'toshiba_acpi_dev' object is allocated first and free'd last. We can
> > bind its life-time to the parent ACPI device object. This is a first step
> > in using more device-managed allocated functions for this.
> >
> > The main intent is to try to convert the IIO framework to export only
> > device-managed functions (i.e. devm_iio_device_alloc() and
> > devm_iio_device_register()). It's still not 100% sure that this is
> > possible, but for now, this is the process of taking it slowly in that
> > direction.
> >
> > Signed-off-by: Alexandru Ardelean 
>
> Might just be me, but naming anything dev that isn't a struct device *
> is downright confusing?
>

I found it a bit odd as well, but I decided not to take it into
consideration for now.

>
>
>
> > ---
> >  drivers/platform/x86/toshiba_acpi.c | 6 ++
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/platform/x86/toshiba_acpi.c 
> > b/drivers/platform/x86/toshiba_acpi.c
> > index fa7232ad8c39..6d298810b7bf 100644
> > --- a/drivers/platform/x86/toshiba_acpi.c
> > +++ b/drivers/platform/x86/toshiba_acpi.c
> > @@ -2998,8 +2998,6 @@ static int toshiba_acpi_remove(struct acpi_device 
> > *acpi_dev)
> >   if (toshiba_acpi)
> >   toshiba_acpi = NULL;
> >
> > - kfree(dev);
> > -
> >   return 0;
> >  }
> >
> > @@ -3016,6 +3014,7 @@ static const char *find_hci_method(acpi_handle handle)
> >
> >  static int toshiba_acpi_add(struct acpi_device *acpi_dev)
> >  {
> > + struct device *parent = &acpi_dev->dev;
> >   struct toshiba_acpi_dev *dev;
> >   const char *hci_method;
> >   u32 dummy;
> > @@ -3033,7 +3032,7 @@ static int toshiba_acpi_add(struct acpi_device 
> > *acpi_dev)
> >   return -ENODEV;
> >   }
> >
> > - dev = kzalloc(sizeof(*dev), GFP_KERNEL);
> > + dev = devm_kzalloc(parent, sizeof(*dev), GFP_KERNEL);
> >   if (!dev)
> >   return -ENOMEM;
> >   dev->acpi_dev = acpi_dev;
> > @@ -3045,7 +3044,6 @@ static int toshiba_acpi_add(struct acpi_device 
> > *acpi_dev)
> >   ret = misc_register(&dev->miscdev);
> >   if (ret) {
> >   pr_err("Failed to register miscdevice\n");
> > - kfree(dev);
> >   return ret;
> >   }
> >
>


[PATCH net-next 1/1] stmmac: intel: add cross time-stamping freq difference adjustment

2021-03-29 Thread Wong Vee Khee
The cross time-stamping mechanism used in certain instances of Intel mGbE
may run at a different clock frequency from the one used by the
processor, so introduce a cross T/S frequency adjustment to ensure the
TSC calculation is correct when the processor gets the cross
time-stamps.

Signed-off-by: Wong Vee Khee 
---
 .../net/ethernet/stmicro/stmmac/dwmac-intel.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c
index 08b4852eed4c..3d9a57043af2 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c
@@ -22,8 +22,13 @@
 #define PCH_PTP_CLK_FREQ_19_2MHZ   (GMAC_GPO0)
 #define PCH_PTP_CLK_FREQ_200MHZ(0)
 
+/* Cross-timestamping defines */
+#define ART_CPUID_LEAF 0x15
+#define EHL_PSE_ART_MHZ1920
+
 struct intel_priv_data {
int mdio_adhoc_addr;/* mdio address for serdes & etc */
+   unsigned long crossts_adj;
bool is_pse;
 };
 
@@ -340,9 +345,26 @@ static int intel_crosststamp(ktime_t *device,
*system = convert_art_to_tsc(art_time);
}
 
+   system->cycles *= intel_priv->crossts_adj;
+
return 0;
 }
 
+static void intel_mgbe_pse_crossts_adj(struct intel_priv_data *intel_priv,
+  int base)
+{
+   if (boot_cpu_has(X86_FEATURE_ART)) {
+   unsigned int art_freq;
+
+   /* On systems that support ART, ART frequency can be obtained
+* from ECX register of CPUID leaf (0x15).
+*/
+   art_freq = cpuid_ecx(ART_CPUID_LEAF);
+   do_div(art_freq, base);
+   intel_priv->crossts_adj = art_freq;
+   }
+}
+
 static void common_default_data(struct plat_stmmacenet_data *plat)
 {
plat->clk_csr = 2;  /* clk_csr_i = 20-35MHz & MDC = clk_csr_i/16 */
@@ -551,6 +573,8 @@ static int ehl_pse0_common_data(struct pci_dev *pdev,
plat->bus_id = 2;
plat->addr64 = 32;
 
+   intel_mgbe_pse_crossts_adj(intel_priv, EHL_PSE_ART_MHZ);
+
return ehl_common_data(pdev, plat);
 }
 
@@ -587,6 +611,8 @@ static int ehl_pse1_common_data(struct pci_dev *pdev,
plat->bus_id = 3;
plat->addr64 = 32;
 
+   intel_mgbe_pse_crossts_adj(intel_priv, EHL_PSE_ART_MHZ);
+
return ehl_common_data(pdev, plat);
 }
 
@@ -913,6 +939,7 @@ static int intel_eth_pci_probe(struct pci_dev *pdev,
 
plat->bsp_priv = intel_priv;
intel_priv->mdio_adhoc_addr = INTEL_MGBE_ADHOC_ADDR;
+   intel_priv->crossts_adj = 1;
 
/* Initialize all MSI vectors to invalid so that it can be set
 * according to platform data settings below.
-- 
2.25.1
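
The arithmetic behind the adjustment in the patch above can be sketched in
plain userspace C. This is only an illustration under stated assumptions: the
helper names sketch_crossts_adj() and sketch_scale_cycles() are hypothetical
and not part of the driver, and the frequencies used to exercise them are
example values, not figures taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not driver code): ratio between the ART frequency
 * reported by the CPU and the frequency the cross-timestamp logic runs at.
 * Integer division truncates, as it would in the kernel.
 */
static uint64_t sketch_crossts_adj(uint32_t art_freq_hz, uint32_t pse_freq_hz)
{
	if (pse_freq_hz == 0)
		return 1;	/* fall back to the neutral factor of 1 */
	return art_freq_hz / pse_freq_hz;
}

/* Scale a raw cycle count by the adjustment factor computed above. */
static uint64_t sketch_scale_cycles(uint64_t cycles, uint64_t adj)
{
	return cycles * adj;
}
```

With an assumed ART frequency twice that of the time-stamping clock, the
factor is 2 and every captured cycle count is doubled; when the two clocks
match, the neutral factor of 1 set in the probe path leaves counts unchanged.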



Re: [PATCH] KVM: X86: Properly account for guest CPU time when considering context tracking

2021-03-29 Thread Wanpeng Li
On Tue, 30 Mar 2021 at 01:15, Sean Christopherson  wrote:
>
> +Thomas
>
> On Mon, Mar 29, 2021, Wanpeng Li wrote:
> > From: Wanpeng Li 
> >
> > The bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=209831
> > reported that the guest time remains 0 when running a while true
> > loop in the guest.
> >
> > The commit 87fa7f3e98a131 ("x86/kvm: Move context tracking where it
> > belongs") moves guest_exit_irqoff() close to vmexit breaks the
> > tick-based time accounting when the ticks that happen after IRQs are
> > disabled are incorrectly accounted to the host/system time. This is
> > because we exit the guest state too early.
> >
> > vtime-based time accounting is tied to context tracking, keep the
> > guest_exit_irqoff() around vmexit code when both vtime-based time
> > accounting and specific cpu is context tracking mode active.
> > Otherwise, leave guest_exit_irqoff() after handle_exit_irqoff()
> > and explicit IRQ window for tick-based time accounting.
> >
> > Fixes: 87fa7f3e98a131 ("x86/kvm: Move context tracking where it belongs")
> > Cc: Sean Christopherson 
> > Signed-off-by: Wanpeng Li 
> > ---
> >  arch/x86/kvm/svm/svm.c | 3 ++-
> >  arch/x86/kvm/vmx/vmx.c | 3 ++-
> >  arch/x86/kvm/x86.c | 2 ++
> >  3 files changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 58a45bb..55fb5ce 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -3812,7 +3812,8 @@ static noinstr void svm_vcpu_enter_exit(struct 
> > kvm_vcpu *vcpu,
> >* into world and some more.
> >*/
> >   lockdep_hardirqs_off(CALLER_ADDR0);
> > - guest_exit_irqoff();
> > + if (vtime_accounting_enabled_this_cpu())
> > + guest_exit_irqoff();
> >
> >   instrumentation_begin();
> >   trace_hardirqs_off_finish();
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 32cf828..85695b3 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -6689,7 +6689,8 @@ static noinstr void vmx_vcpu_enter_exit(struct 
> > kvm_vcpu *vcpu,
> >* into world and some more.
> >*/
> >   lockdep_hardirqs_off(CALLER_ADDR0);
> > - guest_exit_irqoff();
> > + if (vtime_accounting_enabled_this_cpu())
> > + guest_exit_irqoff();
>
> This looks ok, as CONFIG_CONTEXT_TRACKING and CONFIG_VIRT_CPU_ACCOUNTING_GEN are
> selected by CONFIG_NO_HZ_FULL=y, and can't be enabled independently, e.g. the
> rcu_user_exit() call won't be delayed because it will never be called in the
> !vtime case.  But it still feels wrong poking into those details, e.g. it'll
> be weird and/or wrong guest_exit_irqoff() gains stuff that isn't vtime specific.

Could you elaborate on what you mean by "it'll be weird and/or wrong
guest_exit_irqoff() gains stuff that isn't vtime specific."?

Wanpeng


Re: [PATCH] KVM: X86: Properly account for guest CPU time when considering context tracking

2021-03-29 Thread Sean Christopherson
+Thomas

On Mon, Mar 29, 2021, Wanpeng Li wrote:
> From: Wanpeng Li 
> 
> The bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=209831 
> reported that the guest time remains 0 when running a while true 
> loop in the guest.
> 
> The commit 87fa7f3e98a131 ("x86/kvm: Move context tracking where it 
> belongs") moves guest_exit_irqoff() close to vmexit breaks the 
> tick-based time accounting when the ticks that happen after IRQs are 
> disabled are incorrectly accounted to the host/system time. This is 
> because we exit the guest state too early.
> 
> vtime-based time accounting is tied to context tracking, keep the 
> guest_exit_irqoff() around vmexit code when both vtime-based time 
> accounting and specific cpu is context tracking mode active. 
> Otherwise, leave guest_exit_irqoff() after handle_exit_irqoff() 
> and explicit IRQ window for tick-based time accounting.
> 
> Fixes: 87fa7f3e98a131 ("x86/kvm: Move context tracking where it belongs")
> Cc: Sean Christopherson 
> Signed-off-by: Wanpeng Li 
> ---
>  arch/x86/kvm/svm/svm.c | 3 ++-
>  arch/x86/kvm/vmx/vmx.c | 3 ++-
>  arch/x86/kvm/x86.c | 2 ++
>  3 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 58a45bb..55fb5ce 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3812,7 +3812,8 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu 
> *vcpu,
>* into world and some more.
>*/
>   lockdep_hardirqs_off(CALLER_ADDR0);
> - guest_exit_irqoff();
> + if (vtime_accounting_enabled_this_cpu())
> + guest_exit_irqoff();
>  
>   instrumentation_begin();
>   trace_hardirqs_off_finish();
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 32cf828..85695b3 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6689,7 +6689,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu 
> *vcpu,
>* into world and some more.
>*/
>   lockdep_hardirqs_off(CALLER_ADDR0);
> - guest_exit_irqoff();
> + if (vtime_accounting_enabled_this_cpu())
> + guest_exit_irqoff();

This looks ok, as CONFIG_CONTEXT_TRACKING and CONFIG_VIRT_CPU_ACCOUNTING_GEN are
selected by CONFIG_NO_HZ_FULL=y, and can't be enabled independently, e.g. the
rcu_user_exit() call won't be delayed because it will never be called in the
!vtime case.  But it still feels wrong poking into those details, e.g. it'll
be weird and/or wrong guest_exit_irqoff() gains stuff that isn't vtime specific.
Maybe that will never happen though?  And of course, my hack alternative also
pokes into the details[*].

Thomas, do you have any input on the least awful way to handle this?  My horrible
hack was to force PF_VCPU around the window where KVM handles IRQs after guest
exit.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d9f931c63293..6ddf341cd755 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9118,6 +9118,13 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
vcpu->mode = OUTSIDE_GUEST_MODE;
smp_wmb();

+   /*
+* Temporarily pretend this task is running a vCPU when potentially
+* processing an IRQ exit, including the below opening of an IRQ
+* window.  Tick-based accounting of guest time relies on PF_VCPU
+* being set when the tick IRQ handler runs.
+*/
+   current->flags |= PF_VCPU;
static_call(kvm_x86_handle_exit_irqoff)(vcpu);

/*
@@ -9132,6 +9139,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
++vcpu->stat.exits;
local_irq_disable();
kvm_after_interrupt(vcpu);
+   current->flags &= ~PF_VCPU;

if (lapic_in_kernel(vcpu)) {
s64 delta = vcpu->arch.apic->lapic_timer.advance_expire_delta;

[*]https://lkml.kernel.org/r/20210206004218.312023-1-sea...@google.com

>   instrumentation_begin();
>   trace_hardirqs_off_finish();
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fe806e8..234c8b3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9185,6 +9185,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>   ++vcpu->stat.exits;
>   local_irq_disable();
>   kvm_after_interrupt(vcpu);
> + if (!vtime_accounting_enabled_this_cpu())
> + guest_exit_irqoff();
>  
>   if (lapic_in_kernel(vcpu)) {
>   s64 delta = vcpu->arch.apic->lapic_timer.advance_expire_delta;
> -- 
> 2.7.4
> 
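
The accounting problem discussed in this thread can be modelled with a small
userspace sketch. It assumes, as stated in the comment in the hack above, that
tick-based accounting charges a tick to guest time only when PF_VCPU is set
while the tick IRQ handler runs. Everything here is hypothetical and
simplified: SKETCH_PF_VCPU, struct sketch_task and the two run_*() helpers are
illustrative names, not kernel code.

```c
#include <stdint.h>

#define SKETCH_PF_VCPU 0x1u	/* illustrative flag value, not the kernel's */

struct sketch_task {
	unsigned int flags;
	uint64_t guest_ticks;
	uint64_t system_ticks;
};

/* Tick handler: charge the tick according to the flag at IRQ time. */
static void sketch_account_tick(struct sketch_task *t)
{
	if (t->flags & SKETCH_PF_VCPU)
		t->guest_ticks++;
	else
		t->system_ticks++;
}

/* Guest state is dropped right after VM-exit, before the IRQ window. */
static void run_early_exit(struct sketch_task *t)
{
	t->flags |= SKETCH_PF_VCPU;	/* guest entry */
	t->flags &= ~SKETCH_PF_VCPU;	/* guest exit, IRQs still disabled */
	sketch_account_tick(t);		/* pending tick fires in the IRQ window */
}

/* Guest state is kept across the IRQ window and dropped afterwards. */
static void run_late_exit(struct sketch_task *t)
{
	t->flags |= SKETCH_PF_VCPU;	/* guest entry */
	sketch_account_tick(t);		/* pending tick fires in the IRQ window */
	t->flags &= ~SKETCH_PF_VCPU;	/* guest exit after the window */
}
```

In this model the early-exit ordering charges the tick to system time and the
late-exit ordering charges it to guest time, which is the difference both the
patch and the alternative hack are trying to restore for the tick-based case.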


Re: [PATCH v5 1/3] dt-bindings:iio:adc: add generic settling-time-us and oversampling-ratio channel properties

2021-03-29 Thread Jonathan Cameron
On Mon, 29 Mar 2021 15:06:20 +0200
Oleksij Rempel  wrote:

> On Mon, Mar 29, 2021 at 11:25:32AM +0100, Jonathan Cameron wrote:
> > On Mon, 29 Mar 2021 09:31:29 +0200
> > Oleksij Rempel  wrote:
> >   
> > > Settling time and over sampling is a typical challenge for different IIO 
> > > ADC
> > > devices. So, introduce channel specific settling-time-us and 
> > > oversampling-ratio
> > > properties to cover this use case.
> > > 
> > > Signed-off-by: Oleksij Rempel 
> > > ---
> > >  Documentation/devicetree/bindings/iio/adc/adc.yaml | 8 
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/Documentation/devicetree/bindings/iio/adc/adc.yaml 
> > > b/Documentation/devicetree/bindings/iio/adc/adc.yaml
> > > index 912a7635edc4..d5bc86d2a2af 100644
> > > --- a/Documentation/devicetree/bindings/iio/adc/adc.yaml
> > > +++ b/Documentation/devicetree/bindings/iio/adc/adc.yaml
> > > @@ -39,4 +39,12 @@ properties:
> > >    The first value specifies the positive input pin, the second
> > >specifies the negative input pin.
> > >  
> > > +  settling-time-us:
> > > +description:
> > > +  Time between enabling the channel and firs stable readings.  
> > 
> > first  
> 
> ack
> 
> > > +
> > > +  oversampling-ratio:
> > > +$ref: /schemas/types.yaml#/definitions/uint32
> > > +description: Number of data samples which are averaged for each 
> > > read.  
> > 
> > I think I've asked about this in previous reviews, but I want a clear statement
> > of why you think this property is a feature of the 'board' (and hence should be
> > in device tree) rather than setting sensible defaults and leaving any control
> > to userspace?  
> 
> yes, my reply was:

Ah. I missed it somewhere along the way, thanks for repeating here.

> > Oversampling is used as replacement of or addition to the low-pass filter. The
> > filter can be implemented on board, but it will change settling time
> > characteristic. Since low-pass filter is board specific characteristic, this
> > property belongs in device tree as well.  
> 
> I could imagine that these values can be overwritten from user space for
> diagnostics, but we need some working default values. 

Hmm. So low pass filters are interesting whether they are actually a
characteristic of the board (obviously they are if they are resistors/ caps
etc on the board), or of the application. Some applications want noisy messy
data, others not so much. What filter you need to achieve a specific noise
level on a given board is indeed a characteristic of the board.  However, what
that noise level is (which actually drives the decision) is not a board
characteristic. If we have a configurable filter, then that can be argued to
be a policy decision and hence userspace, not DT.

> 
> Should I integrate this comment into the yaml?

Definitely.  Whilst I'm not that keen on this one, you have made a reasonable
argument that it is 'sort of' a board characteristic, so I can live with that
as long as it is there.   Perhaps the slightly amended version of the above.

"Oversampling is used as replacement of or addition to the low-pass filter.
In some cases, the desired filtering characteristics are a function of the
device design and can interact with other characteristics such as
settling time."

Jonathan


> 
> Regards,
> Oleksij



Re: [PATCH 01/10] platform/x86: toshiba_acpi: bind life-time of toshiba_acpi_dev to parent

2021-03-29 Thread Jonathan Cameron
On Wed, 24 Mar 2021 14:55:39 +0200
Alexandru Ardelean  wrote:

> The 'toshiba_acpi_dev' object is allocated first and free'd last. We can
> bind its life-time to the parent ACPI device object. This is a first step
> in using more device-managed allocated functions for this.
> 
> The main intent is to try to convert the IIO framework to export only
> device-managed functions (i.e. devm_iio_device_alloc() and
> devm_iio_device_register()). It's still not 100% sure that this is
> possible, but for now, this is the process of taking it slowly in that
> direction.
> 
> Signed-off-by: Alexandru Ardelean 

Might just be me, but naming anything dev that isn't a struct device *
is downright confusing?




> ---
>  drivers/platform/x86/toshiba_acpi.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/platform/x86/toshiba_acpi.c 
> b/drivers/platform/x86/toshiba_acpi.c
> index fa7232ad8c39..6d298810b7bf 100644
> --- a/drivers/platform/x86/toshiba_acpi.c
> +++ b/drivers/platform/x86/toshiba_acpi.c
> @@ -2998,8 +2998,6 @@ static int toshiba_acpi_remove(struct acpi_device 
> *acpi_dev)
>   if (toshiba_acpi)
>   toshiba_acpi = NULL;
>  
> - kfree(dev);
> -
>   return 0;
>  }
>  
> @@ -3016,6 +3014,7 @@ static const char *find_hci_method(acpi_handle handle)
>  
>  static int toshiba_acpi_add(struct acpi_device *acpi_dev)
>  {
> + struct device *parent = &acpi_dev->dev;
>   struct toshiba_acpi_dev *dev;
>   const char *hci_method;
>   u32 dummy;
> @@ -3033,7 +3032,7 @@ static int toshiba_acpi_add(struct acpi_device 
> *acpi_dev)
>   return -ENODEV;
>   }
>  
> - dev = kzalloc(sizeof(*dev), GFP_KERNEL);
> + dev = devm_kzalloc(parent, sizeof(*dev), GFP_KERNEL);
>   if (!dev)
>   return -ENOMEM;
>   dev->acpi_dev = acpi_dev;
> @@ -3045,7 +3044,6 @@ static int toshiba_acpi_add(struct acpi_device 
> *acpi_dev)
>   ret = misc_register(&dev->miscdev);
>   if (ret) {
>   pr_err("Failed to register miscdevice\n");
> - kfree(dev);
>   return ret;
>   }
>  



Re: [PATCH v5 1/3] dt-bindings:iio:adc: add generic settling-time-us and oversampling-ratio channel properties

2021-03-29 Thread Oleksij Rempel
On Mon, Mar 29, 2021 at 11:25:32AM +0100, Jonathan Cameron wrote:
> On Mon, 29 Mar 2021 09:31:29 +0200
> Oleksij Rempel  wrote:
> 
> > Settling time and over sampling is a typical challenge for different IIO ADC
> > devices. So, introduce channel specific settling-time-us and 
> > oversampling-ratio
> > properties to cover this use case.
> > 
> > Signed-off-by: Oleksij Rempel 
> > ---
> >  Documentation/devicetree/bindings/iio/adc/adc.yaml | 8 
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/iio/adc/adc.yaml 
> > b/Documentation/devicetree/bindings/iio/adc/adc.yaml
> > index 912a7635edc4..d5bc86d2a2af 100644
> > --- a/Documentation/devicetree/bindings/iio/adc/adc.yaml
> > +++ b/Documentation/devicetree/bindings/iio/adc/adc.yaml
> > @@ -39,4 +39,12 @@ properties:
> >The first value specifies the positive input pin, the second
> >specifies the negative input pin.
> >  
> > +  settling-time-us:
> > +description:
> > +  Time between enabling the channel and firs stable readings.
> 
> first

ack

> > +
> > +  oversampling-ratio:
> > +$ref: /schemas/types.yaml#/definitions/uint32
> > +description: Number of data samples which are averaged for each read.
> 
> I think I've asked about this in previous reviews, but I want a clear statement
> of why you think this property is a feature of the 'board' (and hence should be
> in device tree) rather than setting sensible defaults and leaving any control
> to userspace?

yes, my reply was:
> Oversampling is used as replacement of or addition to the low-pass filter. The
> filter can be implemented on board, but it will change settling time
> characteristic. Since low-pass filter is board specific characteristic, this
> property belongs in device tree as well.

I could imagine that these values can be overwritten from user space for
diagnostics, but we need some working default values. 

Should I integrate this comment into the yaml?

Regards,
Oleksij
-- 
Pengutronix e.K.   | |
Steuerwalder Str. 21   | http://www.pengutronix.de/  |
31137 Hildesheim, Germany  | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |


Re: [PATCH v5 1/3] dt-bindings:iio:adc: add generic settling-time-us and oversampling-ratio channel properties

2021-03-29 Thread Jonathan Cameron
On Mon, 29 Mar 2021 09:31:29 +0200
Oleksij Rempel  wrote:

> Settling time and over sampling is a typical challenge for different IIO ADC
> devices. So, introduce channel specific settling-time-us and 
> oversampling-ratio
> properties to cover this use case.
> 
> Signed-off-by: Oleksij Rempel 
> ---
>  Documentation/devicetree/bindings/iio/adc/adc.yaml | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/iio/adc/adc.yaml 
> b/Documentation/devicetree/bindings/iio/adc/adc.yaml
> index 912a7635edc4..d5bc86d2a2af 100644
> --- a/Documentation/devicetree/bindings/iio/adc/adc.yaml
> +++ b/Documentation/devicetree/bindings/iio/adc/adc.yaml
> @@ -39,4 +39,12 @@ properties:
>The first value specifies the positive input pin, the second
>    specifies the negative input pin.
>  
> +  settling-time-us:
> +description:
> +  Time between enabling the channel and firs stable readings.

first

> +
> +  oversampling-ratio:
> +$ref: /schemas/types.yaml#/definitions/uint32
> +description: Number of data samples which are averaged for each read.

I think I've asked about this in previous reviews, but I want a clear statement
of why you think this property is a feature of the 'board' (and hence should be
in device tree) rather than setting sensible defaults and leaving any control
to userspace?

Jonathan

> +
>  additionalProperties: true



[PATCH] KVM: X86: Properly account for guest CPU time when considering context tracking

2021-03-29 Thread Wanpeng Li
From: Wanpeng Li 

The bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=209831 
reported that the guest time remains 0 when running a while true 
loop in the guest.

The commit 87fa7f3e98a131 ("x86/kvm: Move context tracking where it 
belongs") moves guest_exit_irqoff() close to vmexit breaks the 
tick-based time accounting when the ticks that happen after IRQs are 
disabled are incorrectly accounted to the host/system time. This is 
because we exit the guest state too early.

vtime-based time accounting is tied to context tracking, keep the 
guest_exit_irqoff() around vmexit code when both vtime-based time 
accounting and specific cpu is context tracking mode active. 
Otherwise, leave guest_exit_irqoff() after handle_exit_irqoff() 
and explicit IRQ window for tick-based time accounting.

Fixes: 87fa7f3e98a131 ("x86/kvm: Move context tracking where it belongs")
Cc: Sean Christopherson 
Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/svm/svm.c | 3 ++-
 arch/x86/kvm/vmx/vmx.c | 3 ++-
 arch/x86/kvm/x86.c | 2 ++
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 58a45bb..55fb5ce 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3812,7 +3812,8 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 * into world and some more.
 */
lockdep_hardirqs_off(CALLER_ADDR0);
-   guest_exit_irqoff();
+   if (vtime_accounting_enabled_this_cpu())
+   guest_exit_irqoff();
 
instrumentation_begin();
trace_hardirqs_off_finish();
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 32cf828..85695b3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6689,7 +6689,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 * into world and some more.
 */
lockdep_hardirqs_off(CALLER_ADDR0);
-   guest_exit_irqoff();
+   if (vtime_accounting_enabled_this_cpu())
+   guest_exit_irqoff();
 
instrumentation_begin();
trace_hardirqs_off_finish();
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe806e8..234c8b3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9185,6 +9185,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
++vcpu->stat.exits;
local_irq_disable();
kvm_after_interrupt(vcpu);
+   if (!vtime_accounting_enabled_this_cpu())
+   guest_exit_irqoff();
 
if (lapic_in_kernel(vcpu)) {
s64 delta = vcpu->arch.apic->lapic_timer.advance_expire_delta;
-- 
2.7.4



[PATCH v5 1/3] dt-bindings:iio:adc: add generic settling-time-us and oversampling-ratio channel properties

2021-03-29 Thread Oleksij Rempel
Settling time and over sampling is a typical challenge for different IIO ADC
devices. So, introduce channel specific settling-time-us and oversampling-ratio
properties to cover this use case.

Signed-off-by: Oleksij Rempel 
---
 Documentation/devicetree/bindings/iio/adc/adc.yaml | 8 
 1 file changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/iio/adc/adc.yaml b/Documentation/devicetree/bindings/iio/adc/adc.yaml
index 912a7635edc4..d5bc86d2a2af 100644
--- a/Documentation/devicetree/bindings/iio/adc/adc.yaml
+++ b/Documentation/devicetree/bindings/iio/adc/adc.yaml
@@ -39,4 +39,12 @@ properties:
   The first value specifies the positive input pin, the second
   specifies the negative input pin.
 
+  settling-time-us:
+description:
+  Time between enabling the channel and firs stable readings.
+
+  oversampling-ratio:
+$ref: /schemas/types.yaml#/definitions/uint32
+description: Number of data samples which are averaged for each read.
+
 additionalProperties: true
-- 
2.29.2



[PATCH v3 25/27] perf tests: Support 'Convert perf time to TSC' test for hybrid

2021-03-29 Thread Jin Yao
On a hybrid platform, "cycles:u" creates two "cycles" events, so the
second evsel in the evlist also needs initialization.

With this patch,

  # ./perf test 71
  71: Convert perf time to TSC: Ok

Signed-off-by: Jin Yao 
---
v3:
 - No functional change.

 tools/perf/tests/perf-time-to-tsc.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/tools/perf/tests/perf-time-to-tsc.c b/tools/perf/tests/perf-time-to-tsc.c
index 680c3cffb128..72f268c6cc5d 100644
--- a/tools/perf/tests/perf-time-to-tsc.c
+++ b/tools/perf/tests/perf-time-to-tsc.c
@@ -20,6 +20,7 @@
 #include "tsc.h"
 #include "mmap.h"
 #include "tests.h"
+#include "pmu.h"
 
 #define CHECK__(x) {   \
while ((x) < 0) {   \
@@ -66,6 +67,10 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, 
int subtest __maybe
u64 test_tsc, comm1_tsc, comm2_tsc;
u64 test_time, comm1_time = 0, comm2_time = 0;
struct mmap *md;
+   bool hybrid = false;
+
+   if (perf_pmu__has_hybrid())
+   hybrid = true;
 
threads = thread_map__new(-1, getpid(), UINT_MAX);
CHECK_NOT_NULL__(threads);
@@ -88,6 +93,17 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, 
int subtest __maybe
evsel->core.attr.disabled = 1;
evsel->core.attr.enable_on_exec = 0;
 
+   /*
+* For hybrid "cycles:u", it creates two events.
+* Init the second evsel here.
+*/
+   if (hybrid) {
+   evsel = evsel__next(evsel);
+   evsel->core.attr.comm = 1;
+   evsel->core.attr.disabled = 1;
+   evsel->core.attr.enable_on_exec = 0;
+   }
+
CHECK__(evlist__open(evlist));
 
CHECK__(evlist__mmap(evlist, UINT_MAX));
-- 
2.17.1



[PATCH v2 4/6] sched: introduce task block time in schedstats

2021-03-27 Thread Yafang Shao
Currently in schedstats we have sum_sleep_runtime and iowait_sum, but
there's no metric to show how long the task is in D state.  Once a task is
in D state, it is blocked in the kernel; for example, the task may be
waiting for a mutex. The D state is more frequent than iowait, and it is
more critical than the S state, so it is worth adding a metric to
measure it.

Signed-off-by: Yafang Shao 
---
 include/linux/sched.h | 2 ++
 kernel/sched/debug.c  | 6 --
 kernel/sched/stats.c  | 1 +
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index b687bb38897b..2b885481b8bf 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -428,6 +428,8 @@ struct sched_statistics {
 
u64 block_start;
u64 block_max;
+   s64 sum_block_runtime;
+
u64 exec_max;
u64 slice_max;
 
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index d1bc616936d9..0995412dd3c0 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -499,10 +499,11 @@ print_task(struct seq_file *m, struct rq *rq, struct 
task_struct *p)
(long long)(p->nvcsw + p->nivcsw),
p->prio);
 
-   SEQ_printf(m, "%9Ld.%06ld %9Ld.%06ld %9Ld.%06ld",
+   SEQ_printf(m, "%9lld.%06ld %9lld.%06ld %9lld.%06ld %9lld.%06ld",
SPLIT_NS(schedstat_val_or_zero(p->stats.wait_sum)),
SPLIT_NS(p->se.sum_exec_runtime),
-   SPLIT_NS(schedstat_val_or_zero(p->stats.sum_sleep_runtime)));
+   SPLIT_NS(schedstat_val_or_zero(p->stats.sum_sleep_runtime)),
+   SPLIT_NS(schedstat_val_or_zero(p->stats.sum_block_runtime)));
 
 #ifdef CONFIG_NUMA_BALANCING
SEQ_printf(m, " %d %d", task_node(p), task_numa_group_id(p));
@@ -941,6 +942,7 @@ void proc_sched_show_task(struct task_struct *p, struct 
pid_namespace *ns,
u64 avg_atom, avg_per_cpu;
 
PN_SCHEDSTAT(stats.sum_sleep_runtime);
+   PN_SCHEDSTAT(stats.sum_block_runtime);
PN_SCHEDSTAT(stats.wait_start);
PN_SCHEDSTAT(stats.sleep_start);
PN_SCHEDSTAT(stats.block_start);
diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
index b2542f4d3192..21fae41c06f5 100644
--- a/kernel/sched/stats.c
+++ b/kernel/sched/stats.c
@@ -82,6 +82,7 @@ void __update_stats_enqueue_sleeper(struct rq *rq, struct 
task_struct *p,
 
__schedstat_set(stats->block_start, 0);
__schedstat_add(stats->sum_sleep_runtime, delta);
+   __schedstat_add(stats->sum_block_runtime, delta);
 
if (p) {
if (p->in_iowait) {
-- 
2.18.2
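
The accounting idea in this patch can be sketched in user space. This is a
minimal model, not the scheduler code: struct task_stats, account_wakeup()
and the 'blocked' flag are hypothetical names, and only the field names
sum_sleep_runtime and sum_block_runtime come from the patch. It assumes, as
the kernel/sched/stats.c hunk suggests, that D state time is added to both
sums.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the relevant schedstat fields. */
struct task_stats {
	uint64_t sum_sleep_runtime;	/* S state and D state time, in ns */
	uint64_t sum_block_runtime;	/* D state time only, in ns */
};

/*
 * Account one wakeup. 'start' and 'now' are timestamps in ns;
 * 'blocked' is true when the task was in D state (uninterruptible).
 */
static void account_wakeup(struct task_stats *st, uint64_t start,
			   uint64_t now, bool blocked)
{
	uint64_t delta;

	assert(now >= start);
	delta = now - start;

	/* Every sleep, interruptible or not, counts as sleep time. */
	st->sum_sleep_runtime += delta;

	/* Only uninterruptible sleeps count as block time. */
	if (blocked)
		st->sum_block_runtime += delta;
}
```

With this split, sum_sleep_runtime minus sum_block_runtime gives the time
spent in S state alone.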



Re: [PATCH v4 1/3] dt-bindings:iio:adc: add generic settling-time-us and oversampling-ratio channel properties

2021-03-26 Thread Rob Herring
On Mon, Mar 22, 2021 at 04:06:06PM +0100, Oleksij Rempel wrote:
> Settling time and over sampling is a typical challenge for different IIO ADC
> devices. So, introduce channel specific settling-time-us and oversampling-ratio
> properties to cover this use case.
> 
> Signed-off-by: Oleksij Rempel 
> ---
>  Documentation/devicetree/bindings/iio/adc/adc.yaml | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/iio/adc/adc.yaml 
> b/Documentation/devicetree/bindings/iio/adc/adc.yaml
> index 912a7635edc4..66fd4b45f097 100644
> --- a/Documentation/devicetree/bindings/iio/adc/adc.yaml
> +++ b/Documentation/devicetree/bindings/iio/adc/adc.yaml
> @@ -39,4 +39,13 @@ properties:
>The first value specifies the positive input pin, the second
>    specifies the negative input pin.
>  
> +  settling-time-us:
> +$ref: /schemas/types.yaml#/definitions/uint32

Don't need a type for properties with a standard unit suffix.

> +description:
> +  Time between enabling the channel and firs stable readings.
> +
> +  oversampling-ratio:
> +$ref: /schemas/types.yaml#/definitions/uint32
> +description: Number of data samples which are averaged for each read.
> +
>  additionalProperties: true
> -- 
> 2.29.2
> 


[PATCH v2 06/20] drm/dp: Clarify DP AUX registration time

2021-03-26 Thread Lyude Paul
The docs we had for drm_dp_aux_init() and drm_dp_aux_register() were mostly
correct, except for the fact that they made the assumption that all AUX
devices were grandchildren of their respective DRM devices. This is the
case for most normal GPUs, but is almost never the case with SoCs and
display bridges. So, let's fix this documentation to clarify when the right
time to use drm_dp_aux_init() or drm_dp_aux_register() is.

Signed-off-by: Lyude Paul 
---
 drivers/gpu/drm/drm_dp_helper.c | 44 +++--
 1 file changed, 31 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index eedbb48815b7..3fa858b9691c 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -1728,10 +1728,17 @@ EXPORT_SYMBOL(drm_dp_remote_aux_init);
  * drm_dp_aux_init() - minimally initialise an aux channel
  * @aux: DisplayPort AUX channel
  *
- * If you need to use the drm_dp_aux's i2c adapter prior to registering it
- * with the outside world, call drm_dp_aux_init() first. You must still
- * call drm_dp_aux_register() once the connector has been registered to
- * allow userspace access to the auxiliary DP channel.
+ * If you need to use the drm_dp_aux's i2c adapter prior to registering it with
+ * the outside world, call drm_dp_aux_init() first. For drivers which are
+ * grandparents to their AUX adapters (e.g. the AUX adapter is parented by a
+ * &drm_connector), you must still call drm_dp_aux_register() once the connector
+ * has been registered to allow userspace access to the auxiliary DP channel.
+ * Likewise, for such drivers you should also assign &drm_dp_aux.drm_dev as
+ * early as possible so that the &drm_device that corresponds to the AUX adapter
+ * may be mentioned in debugging output from the DRM DP helpers.
+ *
+ * For devices which use a separate platform device for their AUX adapters, this
+ * may be called as early as required by the driver.
  */
 void drm_dp_aux_init(struct drm_dp_aux *aux)
 {
@@ -1751,15 +1758,26 @@ EXPORT_SYMBOL(drm_dp_aux_init);
  * drm_dp_aux_register() - initialise and register aux channel
  * @aux: DisplayPort AUX channel
  *
- * Automatically calls drm_dp_aux_init() if this hasn't been done yet.
- * This should only be called when the underlying &struct drm_connector is
- * initialiazed already. Therefore the best place to call this is from
- * &drm_connector_funcs.late_register. Not that drivers which don't follow this
- * will Oops when CONFIG_DRM_DP_AUX_CHARDEV is enabled.
- *
- * Drivers which need to use the aux channel before that point (e.g. at driver
- * load time, before drm_dev_register() has been called) need to call
- * drm_dp_aux_init().
+ * Automatically calls drm_dp_aux_init() if this hasn't been done yet. This
+ * should only be called once the parent of @aux, &drm_dp_aux.dev, is
+ * initialized. For devices which are grandparents of their AUX channels,
+ * &drm_dp_aux.dev will typically be the &drm_connector &device which
+ * corresponds to @aux. For these devices, it's advised to call
+ * drm_dp_aux_register() in &drm_connector_funcs.late_register, and likewise to
+ * call drm_dp_aux_unregister() in &drm_connector_funcs.early_unregister.
+ * Functions which don't follow this will likely Oops when
+ * %CONFIG_DRM_DP_AUX_CHARDEV is enabled.
+ *
+ * For devices where the AUX channel is a device that exists independently of
+ * the &drm_device that uses it, such as SoCs and bridge devices, it is
+ * recommended to call drm_dp_aux_register() after a &drm_device has been
+ * assigned to &drm_dp_aux.drm_dev, and likewise to call drm_dp_aux_unregister()
+ * once the &drm_device should no longer be associated with the AUX channel
+ * (e.g. on bridge detach).
+ *
+ * Drivers which need to use the aux channel before either of the two points
+ * mentioned above need to call drm_dp_aux_init() in order to use the AUX
+ * channel before registration.
  *
  * Returns 0 on success or a negative error code on failure.
  */
-- 
2.30.2



Re: [PATCH] dt-bindings: i2c: Add device clock-stretch time via dts

2021-03-25 Thread Qii Wang
On Wed, 2021-03-24 at 11:12 -0600, Rob Herring wrote:
> On Sat, Mar 13, 2021 at 04:07:09PM +0800, qii.w...@mediatek.com wrote:
> > From: Qii Wang 
> > 
> > tSU,STA/tHD,STA/tSU,STOP maybe out of spec due to device
> > clock-stretching or circuit loss, we could get device
> > clock-stretch time from dts to adjust these parameters
> > to meet the spec via EXT_CONF register.
> > 
> > Signed-off-by: Qii Wang 
> > ---
> >  Documentation/devicetree/bindings/i2c/i2c-mt65xx.txt | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/i2c/i2c-mt65xx.txt 
> > b/Documentation/devicetree/bindings/i2c/i2c-mt65xx.txt
> > index 7f0194f..97f66f0 100644
> > --- a/Documentation/devicetree/bindings/i2c/i2c-mt65xx.txt
> > +++ b/Documentation/devicetree/bindings/i2c/i2c-mt65xx.txt
> > @@ -32,6 +32,7 @@ Optional properties:
> >- mediatek,have-pmic: platform can control i2c form special pmic side.
> >  Only mt6589 and mt8135 support this feature.
> >- mediatek,use-push-pull: IO config use push-pull mode.
> > +  - clock-stretch-ns: Slave device clock-stretch time.
> 
> Should be a common I2C property?
> 

Wolfram Sang will look at this next and think about it. I hope it would
be a common I2C property.

> >  
> >  Example:
> >  
> > -- 
> > 1.9.1
> > 



[PATCH 2/3] ASoC:codec:max98373: Added 30ms turn on/off time delay

2021-03-24 Thread Ryan Lee
The amp requires 10 to 30 ms to power on and off.
Add a 30 ms delay for stability.

Signed-off-by: Ryan Lee 
---
 sound/soc/codecs/max98373.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sound/soc/codecs/max98373.c b/sound/soc/codecs/max98373.c
index 746c829312b8..1346a98ce8a1 100644
--- a/sound/soc/codecs/max98373.c
+++ b/sound/soc/codecs/max98373.c
@@ -28,11 +28,13 @@ static int max98373_dac_event(struct snd_soc_dapm_widget *w,
regmap_update_bits(max98373->regmap,
MAX98373_R20FF_GLOBAL_SHDN,
MAX98373_GLOBAL_EN_MASK, 1);
+   usleep_range(30000, 31000);
break;
case SND_SOC_DAPM_POST_PMD:
regmap_update_bits(max98373->regmap,
MAX98373_R20FF_GLOBAL_SHDN,
MAX98373_GLOBAL_EN_MASK, 0);
+   usleep_range(30000, 31000);
max98373->tdm_mode = false;
break;
default:
-- 
2.17.1



Re: [PATCH] dt-bindings: i2c: Add device clock-stretch time via dts

2021-03-24 Thread Rob Herring
On Sat, Mar 13, 2021 at 04:07:09PM +0800, qii.w...@mediatek.com wrote:
> From: Qii Wang 
> 
> tSU,STA/tHD,STA/tSU,STOP maybe out of spec due to device
> clock-stretching or circuit loss, we could get device
> clock-stretch time from dts to adjust these parameters
> to meet the spec via EXT_CONF register.
> 
> Signed-off-by: Qii Wang 
> ---
>  Documentation/devicetree/bindings/i2c/i2c-mt65xx.txt | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/devicetree/bindings/i2c/i2c-mt65xx.txt 
> b/Documentation/devicetree/bindings/i2c/i2c-mt65xx.txt
> index 7f0194f..97f66f0 100644
> --- a/Documentation/devicetree/bindings/i2c/i2c-mt65xx.txt
> +++ b/Documentation/devicetree/bindings/i2c/i2c-mt65xx.txt
> @@ -32,6 +32,7 @@ Optional properties:
>- mediatek,have-pmic: platform can control i2c form special pmic side.
>  Only mt6589 and mt8135 support this feature.
>- mediatek,use-push-pull: IO config use push-pull mode.
> +  - clock-stretch-ns: Slave device clock-stretch time.

Should be a common I2C property?

>  
>  Example:
>  
> -- 
> 1.9.1
> 


[PATCH 01/10] platform/x86: toshiba_acpi: bind life-time of toshiba_acpi_dev to parent

2021-03-24 Thread Alexandru Ardelean
The 'toshiba_acpi_dev' object is allocated first and freed last. We can
bind its lifetime to the parent ACPI device object. This is a first step
in using more device-managed allocation functions for this driver.

The main intent is to try to convert the IIO framework to export only
device-managed functions (i.e. devm_iio_device_alloc() and
devm_iio_device_register()). It's still not 100% sure that this is
possible, but for now, this is the process of taking it slowly in that
direction.

Signed-off-by: Alexandru Ardelean 
---
 drivers/platform/x86/toshiba_acpi.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/platform/x86/toshiba_acpi.c 
b/drivers/platform/x86/toshiba_acpi.c
index fa7232ad8c39..6d298810b7bf 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -2998,8 +2998,6 @@ static int toshiba_acpi_remove(struct acpi_device 
*acpi_dev)
if (toshiba_acpi)
toshiba_acpi = NULL;
 
-   kfree(dev);
-
return 0;
 }
 
@@ -3016,6 +3014,7 @@ static const char *find_hci_method(acpi_handle handle)
 
 static int toshiba_acpi_add(struct acpi_device *acpi_dev)
 {
+   struct device *parent = &acpi_dev->dev;
struct toshiba_acpi_dev *dev;
const char *hci_method;
u32 dummy;
@@ -3033,7 +3032,7 @@ static int toshiba_acpi_add(struct acpi_device *acpi_dev)
return -ENODEV;
}
 
-   dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+   dev = devm_kzalloc(parent, sizeof(*dev), GFP_KERNEL);
if (!dev)
return -ENOMEM;
dev->acpi_dev = acpi_dev;
@@ -3045,7 +3044,6 @@ static int toshiba_acpi_add(struct acpi_device *acpi_dev)
ret = misc_register(&dev->miscdev);
if (ret) {
pr_err("Failed to register miscdevice\n");
-   kfree(dev);
return ret;
}
 
-- 
2.30.2



Re: [PATCH v4 18/22] x86/fpu/amx: Define AMX state components and have it used for boot-time checks

2021-03-23 Thread Bae, Chang Seok
On Mar 20, 2021, at 14:31, Thomas Gleixner  wrote:
> On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote:
>> 
>> +static void check_xtile_data_against_struct(int size)
>> +{
>> +u32 max_palid, palid, state_size;
>> +u32 eax, ebx, ecx, edx;
>> +u16 max_tile;
>> +
>> +/*
>> + * Check the maximum palette id:
>> + *   eax: the highest numbered palette subleaf.
>> + */
>> +cpuid_count(TILE_CPUID, 0, &max_palid, &ebx, &ecx, &edx);
>> +
>> +/*
>> + * Cross-check each tile size and find the maximum
>> + * number of supported tiles.
>> + */
>> +for (palid = 1, max_tile = 0; palid <= max_palid; palid++) {
>> +u16 tile_size, max;
>> +
>> +/*
>> + * Check the tile size info:
>> + *   eax[31:16]:  bytes per title
>> + *   ebx[31:16]:  the max names (or max number of tiles)
>> + */
>> +cpuid_count(TILE_CPUID, palid, &eax, &ebx, &ecx, &edx);
>> +tile_size = eax >> 16;
>> +max = ebx >> 16;
>> +
>> +if (WARN_ONCE(tile_size != sizeof(struct xtile_data),
>> +  "%s: struct is %zu bytes, cpu xtile %d bytes\n",
>> +  __stringify(XFEATURE_XTILE_DATA),
>> +  sizeof(struct xtile_data), tile_size))
>> +__xstate_dump_leaves();
>> +
>> +if (max > max_tile)
>> +max_tile = max;
>> +}
>> +
>> +state_size = sizeof(struct xtile_data) * max_tile;
>> +if (WARN_ONCE(size != state_size,
>> +  "%s: calculated size is %u bytes, cpu state %d bytes\n",
>> +  __stringify(XFEATURE_XTILE_DATA), state_size, size))
>> +__xstate_dump_leaves();
> 
> So we have 2 warnings which complain about inconsistent state and that's
> it? Why has this absolutely no consequences? We just keep stuff enabled
> and jug along, right?
> 
> Which one of the two states is correct? Why don't we just disable that
> muck and be done with it to play it safe?
> 
> Failing to execute some workload by saying NO due to inconsistency is
> far more useful than taking the chance of potential silent data
> corruption.

This change in fact follows the mainline code [1], which emits the same kind
of warning on such a mismatch.

Yes, disabling the feature looks like the right way. Or, perhaps, taking the
larger of the two sizes is an option when they mismatch?

At least, given the feedback, the mainline code needs to be revised before
applying this. Correct me if you don't think so.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/fpu/xstate.c#n567

Thanks,
Chang
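
The "say NO on inconsistency" approach discussed above could look roughly
like the user-space sketch below. It is only an illustration under stated
assumptions: struct palette_regs and xtile_layout_consistent() are
hypothetical names, the register values are passed in instead of being read
with CPUID, and the caller is assumed to disable the feature when the check
fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-palette CPUID output; only eax and ebx are used. */
struct palette_regs {
	uint32_t eax;	/* eax[31:16]: bytes per tile */
	uint32_t ebx;	/* ebx[31:16]: max number of tiles */
};

/*
 * Return true when every palette reports 'tile_bytes' per tile and
 * tile_bytes * max_tiles equals 'state_size'. A caller would disable
 * the feature when this returns false instead of only warning.
 */
static bool xtile_layout_consistent(const struct palette_regs *pal,
				    unsigned int npal, uint32_t tile_bytes,
				    uint32_t state_size)
{
	uint32_t max_tile = 0;
	unsigned int i;

	assert(pal != NULL || npal == 0);

	for (i = 0; i < npal; i++) {
		uint32_t size = pal[i].eax >> 16;
		uint32_t max = pal[i].ebx >> 16;

		if (size != tile_bytes)
			return false;
		if (max > max_tile)
			max_tile = max;
	}

	return tile_bytes * max_tile == state_size;
}
```

Returning a boolean keeps the policy decision (warn, disable, or pick a
size) in one place in the caller.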

Re: [PATCH v4 net-next 02/11] net: bridge: add helper to retrieve the current ageing time

2021-03-23 Thread Nikolay Aleksandrov
On 23/03/2021 01:51, Vladimir Oltean wrote:
> From: Vladimir Oltean 
> 
> The SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME attribute is only emitted from:
> 
> sysfs/ioctl/netlink
> -> br_set_ageing_time
>-> __set_ageing_time
> 
> therefore not at bridge port creation time, so:
> (a) switchdev drivers have to hardcode the initial value for the address
> ageing time, because they didn't get any notification
> (b) that hardcoded value can be out of sync, if the user changes the
> ageing time before enslaving the port to the bridge
> 
> We need a helper in the bridge, such that switchdev drivers can query
> the current value of the bridge ageing time when they start offloading
> it.
> 
> Signed-off-by: Vladimir Oltean 
> Reviewed-by: Florian Fainelli 
> Reviewed-by: Tobias Waldekranz 
> ---
>  include/linux/if_bridge.h |  6 ++
>  net/bridge/br_stp.c   | 13 +
>  2 files changed, 19 insertions(+)
> 

The patch is mostly fine, there are a few minor nits (const qualifiers). If
there is another version of the patch-set please add them, either way:

Acked-by: Nikolay Aleksandrov 

> diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
> index 920d3a02cc68..ebd16495459c 100644
> --- a/include/linux/if_bridge.h
> +++ b/include/linux/if_bridge.h
> @@ -137,6 +137,7 @@ struct net_device *br_fdb_find_port(const struct 
> net_device *br_dev,
>  void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
>  bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
>  u8 br_port_get_stp_state(const struct net_device *dev);
> +clock_t br_get_ageing_time(struct net_device *br_dev);
>  #else
>  static inline struct net_device *
>  br_fdb_find_port(const struct net_device *br_dev,
> @@ -160,6 +161,11 @@ static inline u8 br_port_get_stp_state(const struct 
> net_device *dev)
>  {
>   return BR_STATE_DISABLED;
>  }
> +
> +static inline clock_t br_get_ageing_time(struct net_device *br_dev)

const

> +{
> + return 0;
> +}
>  #endif
>  
>  #endif
> diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
> index 86b5e05d3f21..3dafb6143cff 100644
> --- a/net/bridge/br_stp.c
> +++ b/net/bridge/br_stp.c
> @@ -639,6 +639,19 @@ int br_set_ageing_time(struct net_bridge *br, clock_t 
> ageing_time)
>   return 0;
>  }
>  
> +clock_t br_get_ageing_time(struct net_device *br_dev)

const

> +{
> + struct net_bridge *br;

const

> +
> + if (!netif_is_bridge_master(br_dev))
> + return 0;
> +
> + br = netdev_priv(br_dev);
> +
> + return jiffies_to_clock_t(br->ageing_time);
> +}
> +EXPORT_SYMBOL_GPL(br_get_ageing_time);
> +
>  /* called under bridge lock */
>  void __br_set_topology_change(struct net_bridge *br, unsigned char val)
>  {
> 



[PATCH v4 net-next 08/11] net: dsa: inherit the actual bridge port flags at join time

2021-03-22 Thread Vladimir Oltean
From: Vladimir Oltean 

DSA currently assumes that the bridge port starts off with this
constellation of bridge port flags:

- learning on
- unicast flooding on
- multicast flooding on
- broadcast flooding on

just by virtue of code copy-pasta from the bridge layer (new_nbp).
This was a simple enough strategy thus far, because the 'bridge join'
moment always coincided with the 'bridge port creation' moment.

But with sandwiched interfaces, such as:

 br0
  |
bond0
  |
 swp0

it may happen that the user has had time to change the bridge port flags
of bond0 before enslaving swp0 to it. In that case, swp0 will falsely
assume that the bridge port flags are those determined by new_nbp, when
in fact this can happen:

ip link add br0 type bridge
ip link add bond0 type bond
ip link set bond0 master br0
ip link set bond0 type bridge_slave learning off
ip link set swp0 master br0

Now swp0 has learning enabled, bond0 has learning disabled. Not nice.

Fix this by "dumpster diving" through the actual bridge port flags with
br_port_flag_is_set, at bridge join time.

We use this opportunity to split dsa_port_change_brport_flags into two
distinct functions called dsa_port_inherit_brport_flags and
dsa_port_clear_brport_flags, now that the implementation for the two
cases is no longer similar. This patch also creates two functions called
dsa_port_switchdev_sync and dsa_port_switchdev_unsync which collect what
we have so far, even if that's asymmetrical. More is going to be added
in the next patch.
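
The per-flag inheritance described above can be sketched in user space as
follows. This is a simplified model and not the DSA code: the FLAG_* values,
struct port, port_flag_is_set(), port_apply_flags() and
inherit_brport_flags() are hypothetical stand-ins, and error handling
(including the -EOPNOTSUPP case) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits; the real BR_* values live in the kernel. */
#define FLAG_LEARNING		(1UL << 0)
#define FLAG_FLOOD		(1UL << 1)
#define FLAG_MCAST_FLOOD	(1UL << 2)
#define FLAG_BCAST_FLOOD	(1UL << 3)

struct brport_flags {
	unsigned long val;
	unsigned long mask;
};

/* Hypothetical stand-in for both the bridge port and the switch port. */
struct port {
	unsigned long flags;
};

static bool port_flag_is_set(const struct port *p, unsigned long flag)
{
	return (p->flags & flag) != 0;
}

/* Apply one masked update to the offloaded port. */
static void port_apply_flags(struct port *hw, struct brport_flags f)
{
	hw->flags = (hw->flags & ~f.mask) | (f.val & f.mask);
}

/* Copy each flag in 'mask' from the bridge port to the hardware port. */
static void inherit_brport_flags(struct port *hw, const struct port *brport)
{
	const unsigned long mask = FLAG_LEARNING | FLAG_FLOOD |
				   FLAG_MCAST_FLOOD | FLAG_BCAST_FLOOD;
	unsigned int bit;

	assert(hw != NULL && brport != NULL);

	for (bit = 0; bit < 32; bit++) {
		struct brport_flags f = { 0, 0 };

		if (!(mask & (1UL << bit)))
			continue;

		/* One flag per call, as a driver may reject single flags. */
		f.mask = 1UL << bit;
		if (port_flag_is_set(brport, f.mask))
			f.val = f.mask;

		port_apply_flags(hw, f);
	}
}
```

In the bond0 scenario above, a switch port that starts with learning on ends
up with learning off after inheriting from a bridge port whose learning flag
was cleared.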

Signed-off-by: Vladimir Oltean 
---
 net/dsa/port.c | 123 -
 1 file changed, 82 insertions(+), 41 deletions(-)

diff --git a/net/dsa/port.c b/net/dsa/port.c
index fcbe5b1545b8..c712bf3da0a0 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -122,28 +122,84 @@ void dsa_port_disable(struct dsa_port *dp)
rtnl_unlock();
 }
 
-static void dsa_port_change_brport_flags(struct dsa_port *dp,
-bool bridge_offload)
+static int dsa_port_inherit_brport_flags(struct dsa_port *dp,
+struct netlink_ext_ack *extack)
 {
-   struct switchdev_brport_flags flags;
-   int flag;
+   const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
+  BR_BCAST_FLOOD;
+   struct net_device *brport_dev = dsa_port_to_bridge_port(dp);
+   int flag, err;
 
-   flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
-   if (bridge_offload)
-   flags.val = flags.mask;
-   else
-   flags.val = flags.mask & ~BR_LEARNING;
+   for_each_set_bit(flag, &mask, 32) {
+   struct switchdev_brport_flags flags = {0};
+
+   flags.mask = BIT(flag);
 
-   for_each_set_bit(flag, &flags.mask, 32) {
-   struct switchdev_brport_flags tmp;
+   if (br_port_flag_is_set(brport_dev, BIT(flag)))
+   flags.val = BIT(flag);
+
+   err = dsa_port_bridge_flags(dp, flags, extack);
+   if (err && err != -EOPNOTSUPP)
+   return err;
+   }
 
-   tmp.val = flags.val & BIT(flag);
-   tmp.mask = BIT(flag);
+   return 0;
+}
 
-   dsa_port_bridge_flags(dp, tmp, NULL);
+static void dsa_port_clear_brport_flags(struct dsa_port *dp)
+{
+   const unsigned long val = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
+   const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
+  BR_BCAST_FLOOD;
+   int flag, err;
+
+   for_each_set_bit(flag, &mask, 32) {
+   struct switchdev_brport_flags flags = {0};
+
+   flags.mask = BIT(flag);
+   flags.val = val & BIT(flag);
+
+   err = dsa_port_bridge_flags(dp, flags, NULL);
+   if (err && err != -EOPNOTSUPP)
+   dev_err(dp->ds->dev,
+   "failed to clear bridge port flag %lu: %pe\n",
+   flags.val, ERR_PTR(err));
}
 }
 
+static int dsa_port_switchdev_sync(struct dsa_port *dp,
+  struct netlink_ext_ack *extack)
+{
+   int err;
+
+   err = dsa_port_inherit_brport_flags(dp, extack);
+   if (err)
+   return err;
+
+   return 0;
+}
+
+static void dsa_port_switchdev_unsync(struct dsa_port *dp)
+{
+   /* Configure the port for standalone mode (no address learning,
+* flood everything).
+* The bridge only emits SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS events
+* when the user requests it through netlink or sysfs, but not
+* automatically at port join or leave, so we need to handle resetting
+* the brport flags ourselves. But we even prefer it that way, because
+* otherwise, some setups might never get the notification they need,
+   

[PATCH v4 net-next 02/11] net: bridge: add helper to retrieve the current ageing time

2021-03-22 Thread Vladimir Oltean
From: Vladimir Oltean 

The SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME attribute is only emitted from:

sysfs/ioctl/netlink
-> br_set_ageing_time
   -> __set_ageing_time

therefore not at bridge port creation time, so:
(a) switchdev drivers have to hardcode the initial value for the address
ageing time, because they didn't get any notification
(b) that hardcoded value can be out of sync, if the user changes the
ageing time before enslaving the port to the bridge

We need a helper in the bridge, such that switchdev drivers can query
the current value of the bridge ageing time when they start offloading
it.

Signed-off-by: Vladimir Oltean 
Reviewed-by: Florian Fainelli 
Reviewed-by: Tobias Waldekranz 
---
 include/linux/if_bridge.h |  6 ++
 net/bridge/br_stp.c   | 13 +
 2 files changed, 19 insertions(+)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 920d3a02cc68..ebd16495459c 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -137,6 +137,7 @@ struct net_device *br_fdb_find_port(const struct net_device 
*br_dev,
 void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
 bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
 u8 br_port_get_stp_state(const struct net_device *dev);
+clock_t br_get_ageing_time(struct net_device *br_dev);
 #else
 static inline struct net_device *
 br_fdb_find_port(const struct net_device *br_dev,
@@ -160,6 +161,11 @@ static inline u8 br_port_get_stp_state(const struct 
net_device *dev)
 {
return BR_STATE_DISABLED;
 }
+
+static inline clock_t br_get_ageing_time(struct net_device *br_dev)
+{
+   return 0;
+}
 #endif
 
 #endif
diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
index 86b5e05d3f21..3dafb6143cff 100644
--- a/net/bridge/br_stp.c
+++ b/net/bridge/br_stp.c
@@ -639,6 +639,19 @@ int br_set_ageing_time(struct net_bridge *br, clock_t 
ageing_time)
return 0;
 }
 
+clock_t br_get_ageing_time(struct net_device *br_dev)
+{
+   struct net_bridge *br;
+
+   if (!netif_is_bridge_master(br_dev))
+   return 0;
+
+   br = netdev_priv(br_dev);
+
+   return jiffies_to_clock_t(br->ageing_time);
+}
+EXPORT_SYMBOL_GPL(br_get_ageing_time);
+
 /* called under bridge lock */
 void __br_set_topology_change(struct net_bridge *br, unsigned char val)
 {
-- 
2.25.1



Re: [PATCH v3 net-next 07/12] net: dsa: sync ageing time when joining the bridge

2021-03-22 Thread Florian Fainelli



On 3/20/2021 3:34 PM, Vladimir Oltean wrote:
> From: Vladimir Oltean 
> 
> The SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME attribute is only emitted from:
> 
> sysfs/ioctl/netlink
> -> br_set_ageing_time
>-> __set_ageing_time
> 
> therefore not at bridge port creation time, so:
> (a) drivers had to hardcode the initial value for the address ageing time,
> because they didn't get any notification
> (b) that hardcoded value can be out of sync, if the user changes the
> ageing time before enslaving the port to the bridge
> 
> Signed-off-by: Vladimir Oltean 

Reviewed-by: Florian Fainelli 
-- 
Florian


[PATCH v4 1/3] dt-bindings:iio:adc: add generic settling-time-us and oversampling-ratio channel properties

2021-03-22 Thread Oleksij Rempel
Settling time and over sampling is a typical challenge for different IIO ADC
devices. So, introduce channel specific settling-time-us and oversampling-ratio
properties to cover this use case.

Signed-off-by: Oleksij Rempel 
---
 Documentation/devicetree/bindings/iio/adc/adc.yaml | 9 +
 1 file changed, 9 insertions(+)

diff --git a/Documentation/devicetree/bindings/iio/adc/adc.yaml 
b/Documentation/devicetree/bindings/iio/adc/adc.yaml
index 912a7635edc4..66fd4b45f097 100644
--- a/Documentation/devicetree/bindings/iio/adc/adc.yaml
+++ b/Documentation/devicetree/bindings/iio/adc/adc.yaml
@@ -39,4 +39,13 @@ properties:
   The first value specifies the positive input pin, the second
   specifies the negative input pin.
 
+  settling-time-us:
+$ref: /schemas/types.yaml#/definitions/uint32
+description:
+  Time between enabling the channel and firs stable readings.
+
+  oversampling-ratio:
+$ref: /schemas/types.yaml#/definitions/uint32
+description: Number of data samples which are averaged for each read.
+
 additionalProperties: true
-- 
2.29.2



[PATCH 5.4 08/60] s390/vtime: fix increased steal time accounting

2021-03-22 Thread Greg Kroah-Hartman
From: Gerald Schaefer 

commit d54cb7d54877d529bc1e0e1f47a3dd082f73add3 upstream.

Commit 152e9b8676c6e ("s390/vtime: steal time exponential moving average")
inadvertently changed the input value for account_steal_time() from
"cputime_to_nsecs(steal)" to just "steal", resulting in broken increased
steal time accounting.

Fix this by changing it back to "cputime_to_nsecs(steal)".

Fixes: 152e9b8676c6e ("s390/vtime: steal time exponential moving average")
Cc:  # 5.1
Reported-by: Sabine Forkel 
Reviewed-by: Heiko Carstens 
Signed-off-by: Gerald Schaefer 
Signed-off-by: Heiko Carstens 
Signed-off-by: Greg Kroah-Hartman 
---
 arch/s390/kernel/vtime.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -217,7 +217,7 @@ void vtime_flush(struct task_struct *tsk
avg_steal = S390_lowcore.avg_steal_timer / 2;
if ((s64) steal > 0) {
S390_lowcore.steal_timer = 0;
-   account_steal_time(steal);
+   account_steal_time(cputime_to_nsecs(steal));
avg_steal += steal;
}
S390_lowcore.avg_steal_timer = avg_steal;
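
To see why passing the raw value matters, the unit conversion can be
sketched in user space. This is an illustration, not the s390 code:
timer_units_to_ns() is a hypothetical name, and the sketch assumes that one
CPU-timer unit is 1/4096 of a microsecond, so that ns = units * 125 / 512.

```c
#include <assert.h>
#include <stdint.h>

/* 1000 ns per 4096 units reduces to 125 ns per 512 units. */
static_assert(125 * 4096 == 512 * 1000, "125/512 must equal 1000/4096");

/*
 * Convert CPU-timer units to nanoseconds under the assumption above.
 * Splitting the value into high and low parts avoids overflowing the
 * 64-bit multiplication for large inputs.
 */
static uint64_t timer_units_to_ns(uint64_t units)
{
	return ((units >> 9) * 125) + (((units & 0x1ff) * 125) >> 9);
}
```

Under this assumption, a raw value that is accounted as if it were already
in nanoseconds is about four times too large, which matches the "increased
steal time" in the subject.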




[PATCH 5.10 012/157] s390/vtime: fix increased steal time accounting

2021-03-22 Thread Greg Kroah-Hartman
From: Gerald Schaefer 

commit d54cb7d54877d529bc1e0e1f47a3dd082f73add3 upstream.

Commit 152e9b8676c6e ("s390/vtime: steal time exponential moving average")
inadvertently changed the input value for account_steal_time() from
"cputime_to_nsecs(steal)" to just "steal", resulting in broken increased
steal time accounting.

Fix this by changing it back to "cputime_to_nsecs(steal)".

Fixes: 152e9b8676c6e ("s390/vtime: steal time exponential moving average")
Cc:  # 5.1
Reported-by: Sabine Forkel 
Reviewed-by: Heiko Carstens 
Signed-off-by: Gerald Schaefer 
Signed-off-by: Heiko Carstens 
Signed-off-by: Greg Kroah-Hartman 
---
 arch/s390/kernel/vtime.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -217,7 +217,7 @@ void vtime_flush(struct task_struct *tsk
avg_steal = S390_lowcore.avg_steal_timer / 2;
if ((s64) steal > 0) {
S390_lowcore.steal_timer = 0;
-   account_steal_time(steal);
+   account_steal_time(cputime_to_nsecs(steal));
avg_steal += steal;
}
S390_lowcore.avg_steal_timer = avg_steal;




[PATCH 5.11 012/120] s390/vtime: fix increased steal time accounting

2021-03-22 Thread Greg Kroah-Hartman
From: Gerald Schaefer 

commit d54cb7d54877d529bc1e0e1f47a3dd082f73add3 upstream.

Commit 152e9b8676c6e ("s390/vtime: steal time exponential moving average")
inadvertently changed the input value for account_steal_time() from
"cputime_to_nsecs(steal)" to just "steal", resulting in broken increased
steal time accounting.

Fix this by changing it back to "cputime_to_nsecs(steal)".

Fixes: 152e9b8676c6e ("s390/vtime: steal time exponential moving average")
Cc:  # 5.1
Reported-by: Sabine Forkel 
Reviewed-by: Heiko Carstens 
Signed-off-by: Gerald Schaefer 
Signed-off-by: Heiko Carstens 
Signed-off-by: Greg Kroah-Hartman 
---
 arch/s390/kernel/vtime.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -217,7 +217,7 @@ void vtime_flush(struct task_struct *tsk
avg_steal = S390_lowcore.avg_steal_timer / 2;
if ((s64) steal > 0) {
S390_lowcore.steal_timer = 0;
-   account_steal_time(steal);
+   account_steal_time(cputime_to_nsecs(steal));
avg_steal += steal;
}
S390_lowcore.avg_steal_timer = avg_steal;




Re: [RFC PATCH v2 net-next 07/16] net: dsa: sync ageing time when joining the bridge

2021-03-22 Thread Tobias Waldekranz
On Fri, Mar 19, 2021 at 01:18, Vladimir Oltean  wrote:
> From: Vladimir Oltean 
>
> The SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME attribute is only emitted from:
>
> sysfs/ioctl/netlink
> -> br_set_ageing_time
>-> __set_ageing_time
>
> therefore not at bridge port creation time, so:
> (a) drivers had to hardcode the initial value for the address ageing time,
> because they didn't get any notification
> (b) that hardcoded value can be out of sync, if the user changes the
> ageing time before enslaving the port to the bridge
>
> Signed-off-by: Vladimir Oltean 
> ---

Reviewed-by: Tobias Waldekranz 


Re: [PATCH v3 net-next 03/12] net: dsa: inherit the actual bridge port flags at join time

2021-03-20 Thread kernel test robot
Hi Vladimir,

I love your patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Vladimir-Oltean/Better-support-for-sandwiched-LAGs-with-bridge-and-DSA/20210321-063842
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 
d773b7957e4fd7b732a163df0e59d31ad4237302
config: arm64-randconfig-r021-20210321 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 
14696baaf4c43fe53f738bc292bbe169eed93d5d)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm64 cross compiling tool for clang build
# apt-get install binutils-aarch64-linux-gnu
# 
https://github.com/0day-ci/linux/commit/3aac17167e3de0aeaf5287f9d586725bdc7495a5
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Vladimir-Oltean/Better-support-for-sandwiched-LAGs-with-bridge-and-DSA/20210321-063842
git checkout 3aac17167e3de0aeaf5287f9d586725bdc7495a5
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

>> net/dsa/port.c:167:5: warning: format specifies type 'int' but the argument has type 'unsigned long' [-Wformat]
   flags.val, err, ERR_PTR(err));
   ^
   include/linux/dev_printk.h:112:32: note: expanded from macro 'dev_err'
   _dev_err(dev, dev_fmt(fmt), ##__VA_ARGS__)
 ~~~ ^~~
   1 warning generated.


vim +167 net/dsa/port.c

   148  
   149  static void dsa_port_clear_brport_flags(struct dsa_port *dp,
   150  struct netlink_ext_ack *extack)
   151  {
   152  const unsigned long val = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
   153  const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
   154 BR_BCAST_FLOOD;
   155  int flag, err;
   156  
   157  for_each_set_bit(flag, &mask, 32) {
   158  struct switchdev_brport_flags flags = {0};
   159  
   160  flags.mask = BIT(flag);
   161  flags.val = val & BIT(flag);
   162  
   163  err = dsa_port_bridge_flags(dp, flags, extack);
   164  if (err && err != -EOPNOTSUPP)
   165  dev_err(dp->ds->dev,
   166  "failed to clear bridge port flag %d: %d (%pe)\n",
 > 167  flags.val, err, ERR_PTR(err));
   168  }
   169  }
   170  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




Re: [PATCH v3 net-next 03/12] net: dsa: inherit the actual bridge port flags at join time

2021-03-20 Thread kernel test robot
Hi Vladimir,

I love your patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Vladimir-Oltean/Better-support-for-sandwiched-LAGs-with-bridge-and-DSA/20210321-063842
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git d773b7957e4fd7b732a163df0e59d31ad4237302
config: arm-mvebu_v5_defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/3aac17167e3de0aeaf5287f9d586725bdc7495a5
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Vladimir-Oltean/Better-support-for-sandwiched-LAGs-with-bridge-and-DSA/20210321-063842
git checkout 3aac17167e3de0aeaf5287f9d586725bdc7495a5
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   In file included from include/linux/device.h:15,
from include/linux/dma-mapping.h:7,
from include/linux/skbuff.h:31,
from include/net/net_namespace.h:39,
from include/linux/netdevice.h:37,
from include/linux/if_bridge.h:12,
from net/dsa/port.c:9:
   net/dsa/port.c: In function 'dsa_port_clear_brport_flags':
>> net/dsa/port.c:166:5: warning: format '%d' expects argument of type 'int', but argument 3 has type 'long unsigned int' [-Wformat=]
 166 | "failed to clear bridge port flag %d: %d (%pe)\n",
 | ^
   include/linux/dev_printk.h:19:22: note: in definition of macro 'dev_fmt'
  19 | #define dev_fmt(fmt) fmt
 |  ^~~
   net/dsa/port.c:165:4: note: in expansion of macro 'dev_err'
 165 |dev_err(dp->ds->dev,
 |^~~
   net/dsa/port.c:166:40: note: format string is defined here
 166 | "failed to clear bridge port flag %d: %d (%pe)\n",
 |   ~^
 ||
 |int
 |   %ld


vim +166 net/dsa/port.c

   148  
   149  static void dsa_port_clear_brport_flags(struct dsa_port *dp,
   150  struct netlink_ext_ack *extack)
   151  {
   152  const unsigned long val = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
   153  const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
   154 BR_BCAST_FLOOD;
   155  int flag, err;
   156  
   157  for_each_set_bit(flag, &mask, 32) {
   158  struct switchdev_brport_flags flags = {0};
   159  
   160  flags.mask = BIT(flag);
   161  flags.val = val & BIT(flag);
   162  
   163  err = dsa_port_bridge_flags(dp, flags, extack);
   164  if (err && err != -EOPNOTSUPP)
   165  dev_err(dp->ds->dev,
 > 166  "failed to clear bridge port flag %d: %d (%pe)\n",
   167  flags.val, err, ERR_PTR(err));
   168  }
   169  }
   170  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
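
Note on the two reports above: both compilers flag the same mismatch, namely that flags.val is an unsigned long while the format string uses %d. A minimal userspace sketch of a matching specifier follows. The helper name format_flag_error is hypothetical and snprintf stands in for dev_err, so this illustrates the specifier rule only and is not the fix that was applied to the patch.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical userspace helper, not kernel code. It builds a message
 * similar to the one in the dev_err() call, using a specifier that
 * matches the type of each argument. */
static int format_flag_error(char *buf, size_t len,
                             unsigned long flag_val, int err)
{
    /* %lu matches unsigned long; %d matches int. */
    return snprintf(buf, len, "failed to clear bridge port flag %lu: %d",
                    flag_val, err);
}
```

The other option is to keep %d and pass an int, for example the bit index rather than the flag value.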




[PATCH v3 net-next 07/12] net: dsa: sync ageing time when joining the bridge

2021-03-20 Thread Vladimir Oltean
From: Vladimir Oltean 

The SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME attribute is only emitted from:

sysfs/ioctl/netlink
-> br_set_ageing_time
   -> __set_ageing_time

therefore not at bridge port creation time, so:
(a) drivers had to hardcode the initial value for the address ageing time,
because they didn't get any notification
(b) that hardcoded value can be out of sync, if the user changes the
ageing time before enslaving the port to the bridge

Signed-off-by: Vladimir Oltean 
---
Changes in v3:
None.

 include/linux/if_bridge.h |  6 ++
 net/bridge/br_stp.c   | 13 +
 net/dsa/port.c| 10 ++
 3 files changed, 29 insertions(+)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 920d3a02cc68..ebd16495459c 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -137,6 +137,7 @@ struct net_device *br_fdb_find_port(const struct net_device *br_dev,
 void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
 bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
 u8 br_port_get_stp_state(const struct net_device *dev);
+clock_t br_get_ageing_time(struct net_device *br_dev);
 #else
 static inline struct net_device *
 br_fdb_find_port(const struct net_device *br_dev,
@@ -160,6 +161,11 @@ static inline u8 br_port_get_stp_state(const struct net_device *dev)
 {
return BR_STATE_DISABLED;
 }
+
+static inline clock_t br_get_ageing_time(struct net_device *br_dev)
+{
+   return 0;
+}
 #endif
 
 #endif
diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
index 86b5e05d3f21..3dafb6143cff 100644
--- a/net/bridge/br_stp.c
+++ b/net/bridge/br_stp.c
@@ -639,6 +639,19 @@ int br_set_ageing_time(struct net_bridge *br, clock_t ageing_time)
return 0;
 }
 
+clock_t br_get_ageing_time(struct net_device *br_dev)
+{
+   struct net_bridge *br;
+
+   if (!netif_is_bridge_master(br_dev))
+   return 0;
+
+   br = netdev_priv(br_dev);
+
+   return jiffies_to_clock_t(br->ageing_time);
+}
+EXPORT_SYMBOL_GPL(br_get_ageing_time);
+
 /* called under bridge lock */
 void __br_set_topology_change(struct net_bridge *br, unsigned char val)
 {
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 124f8bb21204..95e6f2861290 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -173,6 +173,7 @@ static int dsa_port_switchdev_sync(struct dsa_port *dp,
 {
struct net_device *brport_dev = dsa_port_to_bridge_port(dp);
struct net_device *br = dp->bridge_dev;
+   clock_t ageing_time;
u8 stp_state;
int err;
 
@@ -193,6 +194,11 @@ static int dsa_port_switchdev_sync(struct dsa_port *dp,
if (err && err != -EOPNOTSUPP)
return err;
 
+   ageing_time = br_get_ageing_time(br);
+   err = dsa_port_ageing_time(dp, ageing_time);
+   if (err && err != -EOPNOTSUPP)
+   return err;
+
return 0;
 }
 
@@ -222,6 +228,10 @@ static void dsa_port_switchdev_unsync(struct dsa_port *dp)
 * allow this in standalone mode too.
 */
dsa_port_mrouter(dp->cpu_dp, true, NULL);
+
+   /* Ageing time may be global to the switch chip, so don't change it
+* here because we have no good reason (or value) to change it to.
+*/
 }
 
 int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br,
-- 
2.25.1



[PATCH v3 net-next 03/12] net: dsa: inherit the actual bridge port flags at join time

2021-03-20 Thread Vladimir Oltean
From: Vladimir Oltean 

DSA currently assumes that the bridge port starts off with this
constellation of bridge port flags:

- learning on
- unicast flooding on
- multicast flooding on
- broadcast flooding on

just by virtue of code copy-pasta from the bridge layer (new_nbp).
This was a simple enough strategy thus far, because the 'bridge join'
moment always coincided with the 'bridge port creation' moment.

But with sandwiched interfaces, such as:

 br0
  |
bond0
  |
 swp0

it may happen that the user has had time to change the bridge port flags
of bond0 before enslaving swp0 to it. In that case, swp0 will falsely
assume that the bridge port flags are those determined by new_nbp, when
in fact this can happen:

ip link add br0 type bridge
ip link add bond0 type bond
ip link set bond0 master br0
ip link set bond0 type bridge_slave learning off
ip link set swp0 master br0

Now swp0 has learning enabled, bond0 has learning disabled. Not nice.

Fix this by "dumpster diving" through the actual bridge port flags with
br_port_flag_is_set, at bridge join time.

We use this opportunity to split dsa_port_change_brport_flags into two
distinct functions called dsa_port_inherit_brport_flags and
dsa_port_clear_brport_flags, now that the implementation for the two
cases is no longer similar.

Signed-off-by: Vladimir Oltean 
---
Changes in v3:
Rewrote dsa_port_clear_brport_flags to at least catch errors, and to use
the same "for" loop structure as dsa_port_inherit_brport_flags.

 net/dsa/port.c | 125 -
 1 file changed, 83 insertions(+), 42 deletions(-)

diff --git a/net/dsa/port.c b/net/dsa/port.c
index fcbe5b1545b8..8dbc6e0db30c 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -122,26 +122,82 @@ void dsa_port_disable(struct dsa_port *dp)
rtnl_unlock();
 }
 
-static void dsa_port_change_brport_flags(struct dsa_port *dp,
-bool bridge_offload)
+static int dsa_port_inherit_brport_flags(struct dsa_port *dp,
+struct netlink_ext_ack *extack)
 {
-   struct switchdev_brport_flags flags;
-   int flag;
+   const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
+  BR_BCAST_FLOOD;
+   struct net_device *brport_dev = dsa_port_to_bridge_port(dp);
+   int flag, err;
 
-   flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
-   if (bridge_offload)
-   flags.val = flags.mask;
-   else
-   flags.val = flags.mask & ~BR_LEARNING;
+   for_each_set_bit(flag, &mask, 32) {
+   struct switchdev_brport_flags flags = {0};
 
-   for_each_set_bit(flag, &flags.mask, 32) {
-   struct switchdev_brport_flags tmp;
+   flags.mask = BIT(flag);
 
-   tmp.val = flags.val & BIT(flag);
-   tmp.mask = BIT(flag);
+   if (br_port_flag_is_set(brport_dev, BIT(flag)))
+   flags.val = BIT(flag);
 
-   dsa_port_bridge_flags(dp, tmp, NULL);
+   err = dsa_port_bridge_flags(dp, flags, extack);
+   if (err && err != -EOPNOTSUPP)
+   return err;
}
+
+   return 0;
+}
+
+static void dsa_port_clear_brport_flags(struct dsa_port *dp,
+   struct netlink_ext_ack *extack)
+{
+   const unsigned long val = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
+   const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
+  BR_BCAST_FLOOD;
+   int flag, err;
+
+   for_each_set_bit(flag, &mask, 32) {
+   struct switchdev_brport_flags flags = {0};
+
+   flags.mask = BIT(flag);
+   flags.val = val & BIT(flag);
+
+   err = dsa_port_bridge_flags(dp, flags, extack);
+   if (err && err != -EOPNOTSUPP)
+   dev_err(dp->ds->dev,
+   "failed to clear bridge port flag %d: %d (%pe)\n",
+   flags.val, err, ERR_PTR(err));
+   }
+}
+
+static int dsa_port_switchdev_sync(struct dsa_port *dp,
+  struct netlink_ext_ack *extack)
+{
+   int err;
+
+   err = dsa_port_inherit_brport_flags(dp, extack);
+   if (err)
+   return err;
+
+   return 0;
+}
+
+/* Configure the port for standalone mode (no address learning, flood
+ * everything, BR_STATE_FORWARDING, etc).
+ * The bridge only emits SWITCHDEV_ATTR_ID_PORT_* events when the user
+ * requests it through netlink or sysfs, but not automatically at port
+ * join or leave, so we need to handle resetting the brport flags ourselves.
+ * But we even prefer it that way, because otherwise, some setups might never
+ * get the notification they need, for example, when a port leaves a LAG that
+ * offloads the bridge, 

Re: [PATCH v4 18/22] x86/fpu/amx: Define AMX state components and have it used for boot-time checks

2021-03-20 Thread Thomas Gleixner
On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote:
>  
> +static void check_xtile_data_against_struct(int size)
> +{
> + u32 max_palid, palid, state_size;
> + u32 eax, ebx, ecx, edx;
> + u16 max_tile;
> +
> + /*
> +  * Check the maximum palette id:
> +  *   eax: the highest numbered palette subleaf.
> +  */
> + cpuid_count(TILE_CPUID, 0, &max_palid, &ebx, &ecx, &edx);
> +
> + /*
> +  * Cross-check each tile size and find the maximum
> +  * number of supported tiles.
> +  */
> + for (palid = 1, max_tile = 0; palid <= max_palid; palid++) {
> + u16 tile_size, max;
> +
> + /*
> +  * Check the tile size info:
> +  *   eax[31:16]:  bytes per tile
> +  *   ebx[31:16]:  the max names (or max number of tiles)
> +  */
> + cpuid_count(TILE_CPUID, palid, &eax, &ebx, &ecx, &edx);
> + tile_size = eax >> 16;
> + max = ebx >> 16;
> +
> + if (WARN_ONCE(tile_size != sizeof(struct xtile_data),
> +   "%s: struct is %zu bytes, cpu xtile %d bytes\n",
> +   __stringify(XFEATURE_XTILE_DATA),
> +   sizeof(struct xtile_data), tile_size))
> + __xstate_dump_leaves();
> +
> + if (max > max_tile)
> + max_tile = max;
> + }
> +
> + state_size = sizeof(struct xtile_data) * max_tile;
> + if (WARN_ONCE(size != state_size,
> +   "%s: calculated size is %u bytes, cpu state %d bytes\n",
> +   __stringify(XFEATURE_XTILE_DATA), state_size, size))
> + __xstate_dump_leaves();

So we have 2 warnings which complain about inconsistent state and that's
it? Why has this absolutely no consequences? We just keep stuff enabled
and jug along, right?

Which one of the two states is correct? Why don't we just disable that
muck and be done with it to play it safe?

Failing to execute some workload by saying NO due to inconsistency is
far more useful than taking the chance of potential silent data
corruption.

Thanks,

tglx
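
Note on the suggestion above: one way to read it is that the consistency check should return a result, and the caller should switch the feature off on a mismatch instead of only warning. The userspace sketch below illustrates that pattern. Every name in it (demo_feature, demo_sizes_consistent, demo_check_or_disable) is hypothetical and none of it is taken from the kernel patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CPU feature that can be turned off. */
struct demo_feature {
    bool enabled;
};

/* True when the size reported by hardware equals the size derived
 * from the structure layout. */
static bool demo_sizes_consistent(size_t reported, size_t tile_size,
                                  size_t max_tiles)
{
    return reported == tile_size * max_tiles;
}

/* Fail closed: on any inconsistency, disable the feature so that
 * later users are refused instead of running on unverified state. */
static void demo_check_or_disable(struct demo_feature *f, size_t reported,
                                  size_t tile_size, size_t max_tiles)
{
    if (!demo_sizes_consistent(reported, tile_size, max_tiles))
        f->enabled = false;
}
```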


Re: [RFC PATCH v2 net-next 07/16] net: dsa: sync ageing time when joining the bridge

2021-03-20 Thread Vladimir Oltean
On Fri, Mar 19, 2021 at 03:13:03PM -0700, Florian Fainelli wrote:
> > diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
> > index 86b5e05d3f21..3dafb6143cff 100644
> > --- a/net/bridge/br_stp.c
> > +++ b/net/bridge/br_stp.c
> > @@ -639,6 +639,19 @@ int br_set_ageing_time(struct net_bridge *br, clock_t ageing_time)
> > return 0;
> >  }
> >  
> > +clock_t br_get_ageing_time(struct net_device *br_dev)
> > +{
> > +   struct net_bridge *br;
> > +
> > +   if (!netif_is_bridge_master(br_dev))
> > +   return 0;
> > +
> > +   br = netdev_priv(br_dev);
> > +
> > +   return jiffies_to_clock_t(br->ageing_time);
> 
> Don't you want an ASSERT_RTNL() in this function as well?

Hmm, I'm not sure. I don't think I'm accessing anything that is under
the protection of the rtnl_mutex. If anything, the ageing time is
protected by the "bridge lock", but I don't think there's much of an
issue if I read an unsigned int while not holding it.


Re: [RFC PATCH v2 net-next 03/16] net: dsa: inherit the actual bridge port flags at join time

2021-03-20 Thread Vladimir Oltean
On Fri, Mar 19, 2021 at 03:08:46PM -0700, Florian Fainelli wrote:
> 
> 
> On 3/18/2021 4:18 PM, Vladimir Oltean wrote:
> > From: Vladimir Oltean 
> > 
> > DSA currently assumes that the bridge port starts off with this
> > constellation of bridge port flags:
> > 
> > - learning on
> > - unicast flooding on
> > - multicast flooding on
> > - broadcast flooding on
> > 
> > just by virtue of code copy-pasta from the bridge layer (new_nbp).
> > This was a simple enough strategy thus far, because the 'bridge join'
> > moment always coincided with the 'bridge port creation' moment.
> > 
> > But with sandwiched interfaces, such as:
> > 
> >  br0
> >   |
> > bond0
> >   |
> >  swp0
> > 
> > it may happen that the user has had time to change the bridge port flags
> > of bond0 before enslaving swp0 to it. In that case, swp0 will falsely
> > assume that the bridge port flags are those determined by new_nbp, when
> > in fact this can happen:
> > 
> > ip link add br0 type bridge
> > ip link add bond0 type bond
> > ip link set bond0 master br0
> > ip link set bond0 type bridge_slave learning off
> > ip link set swp0 master br0
> > 
> > Now swp0 has learning enabled, bond0 has learning disabled. Not nice.
> > 
> > Fix this by "dumpster diving" through the actual bridge port flags with
> > br_port_flag_is_set, at bridge join time.
> > 
> > We use this opportunity to split dsa_port_change_brport_flags into two
> > distinct functions called dsa_port_inherit_brport_flags and
> > dsa_port_clear_brport_flags, now that the implementation for the two
> > cases is no longer similar.
> > 
> > Signed-off-by: Vladimir Oltean 
> > ---
> >  net/dsa/port.c | 123 -
> >  1 file changed, 82 insertions(+), 41 deletions(-)
> > 
> > diff --git a/net/dsa/port.c b/net/dsa/port.c
> > index fcbe5b1545b8..346c50467810 100644
> > --- a/net/dsa/port.c
> > +++ b/net/dsa/port.c
> > @@ -122,26 +122,82 @@ void dsa_port_disable(struct dsa_port *dp)
> > rtnl_unlock();
> >  }
> >  
> > -static void dsa_port_change_brport_flags(struct dsa_port *dp,
> > -bool bridge_offload)
> > +static void dsa_port_clear_brport_flags(struct dsa_port *dp,
> > +   struct netlink_ext_ack *extack)
> >  {
> > struct switchdev_brport_flags flags;
> > -   int flag;
> >  
> > -   flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
> > -   if (bridge_offload)
> > -   flags.val = flags.mask;
> > -   else
> > -   flags.val = flags.mask & ~BR_LEARNING;
> > +   flags.mask = BR_LEARNING;
> > +   flags.val = 0;
> > +   dsa_port_bridge_flags(dp, flags, extack);
> 
> Would not you want to use the same for_each_set_bit() loop that
> dsa_port_change_br_flags() uses, that would be a tad more compact.
> -- 
> Florian

The reworded version has an equal number of lines, but at least it
catches errors now:

static void dsa_port_clear_brport_flags(struct dsa_port *dp,
struct netlink_ext_ack *extack)
{
const unsigned long val = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD |
   BR_BCAST_FLOOD;
int flag, err;

for_each_set_bit(flag, &mask, 32) {
struct switchdev_brport_flags flags = {0};

flags.mask = BIT(flag);
flags.val = val & BIT(flag);

err = dsa_port_bridge_flags(dp, flags, extack);
if (err && err != -EOPNOTSUPP)
dev_err(dp->ds->dev,
"failed to clear bridge port flag %d: %d (%pe)\n",
flag, err, ERR_PTR(err));
}
}
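
Note on the loop above: it sends one request per bit of the mask, each carrying a single-bit mask and the matching bit of the desired value. The userspace sketch below shows that splitting step on its own. The DEMO_* bit positions and the demo_* names are hypothetical and do not reflect the real BR_* values, and a plain for loop stands in for the kernel's for_each_set_bit() macro.

```c
#include <stddef.h>

#define DEMO_BIT(n)       (1UL << (n))
#define DEMO_LEARNING     DEMO_BIT(0)
#define DEMO_FLOOD        DEMO_BIT(1)
#define DEMO_MCAST_FLOOD  DEMO_BIT(2)
#define DEMO_BCAST_FLOOD  DEMO_BIT(3)

struct demo_brport_flags {
    unsigned long val;
    unsigned long mask;
};

/* Split (val, mask) into one single-bit request per bit set in mask.
 * Writes at most max entries to out and returns the number written. */
static size_t demo_split_flags(unsigned long val, unsigned long mask,
                               struct demo_brport_flags *out, size_t max)
{
    size_t n = 0;
    int bit;

    for (bit = 0; bit < 32 && n < max; bit++) {
        if (!(mask & DEMO_BIT(bit)))
            continue;
        out[n].mask = DEMO_BIT(bit);
        out[n].val = val & DEMO_BIT(bit);
        n++;
    }
    return n;
}
```

Called with val set to the three flood bits and mask set to all four bits, it produces four requests, and the learning request carries val 0, which corresponds to turning learning off.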


Re: [RFC PATCH v2 net-next 07/16] net: dsa: sync ageing time when joining the bridge

2021-03-19 Thread Florian Fainelli



On 3/18/2021 4:18 PM, Vladimir Oltean wrote:
> From: Vladimir Oltean 
> 
> The SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME attribute is only emitted from:
> 
> sysfs/ioctl/netlink
> -> br_set_ageing_time
>    -> __set_ageing_time
> 
> therefore not at bridge port creation time, so:
> (a) drivers had to hardcode the initial value for the address ageing time,
> because they didn't get any notification
> (b) that hardcoded value can be out of sync, if the user changes the
> ageing time before enslaving the port to the bridge
> 
> Signed-off-by: Vladimir Oltean 
> ---
>  include/linux/if_bridge.h |  6 ++
>  net/bridge/br_stp.c   | 13 +
>  net/dsa/port.c| 10 ++
>  3 files changed, 29 insertions(+)
> 
> diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
> index 920d3a02cc68..ebd16495459c 100644
> --- a/include/linux/if_bridge.h
> +++ b/include/linux/if_bridge.h
> @@ -137,6 +137,7 @@ struct net_device *br_fdb_find_port(const struct net_device *br_dev,
>  void br_fdb_clear_offload(const struct net_device *dev, u16 vid);
>  bool br_port_flag_is_set(const struct net_device *dev, unsigned long flag);
>  u8 br_port_get_stp_state(const struct net_device *dev);
> +clock_t br_get_ageing_time(struct net_device *br_dev);
>  #else
>  static inline struct net_device *
>  br_fdb_find_port(const struct net_device *br_dev,
> @@ -160,6 +161,11 @@ static inline u8 br_port_get_stp_state(const struct net_device *dev)
>  {
>   return BR_STATE_DISABLED;
>  }
> +
> +static inline clock_t br_get_ageing_time(struct net_device *br_dev)
> +{
> + return 0;
> +}
>  #endif
>  
>  #endif
> diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
> index 86b5e05d3f21..3dafb6143cff 100644
> --- a/net/bridge/br_stp.c
> +++ b/net/bridge/br_stp.c
> @@ -639,6 +639,19 @@ int br_set_ageing_time(struct net_bridge *br, clock_t ageing_time)
>   return 0;
>  }
>  
> +clock_t br_get_ageing_time(struct net_device *br_dev)
> +{
> + struct net_bridge *br;
> +
> + if (!netif_is_bridge_master(br_dev))
> + return 0;
> +
> + br = netdev_priv(br_dev);
> +
> + return jiffies_to_clock_t(br->ageing_time);

Don't you want an ASSERT_RTNL() in this function as well?
-- 
Florian


Re: [RFC PATCH v2 net-next 03/16] net: dsa: inherit the actual bridge port flags at join time

2021-03-19 Thread Florian Fainelli



On 3/18/2021 4:18 PM, Vladimir Oltean wrote:
> From: Vladimir Oltean 
> 
> DSA currently assumes that the bridge port starts off with this
> constellation of bridge port flags:
> 
> - learning on
> - unicast flooding on
> - multicast flooding on
> - broadcast flooding on
> 
> just by virtue of code copy-pasta from the bridge layer (new_nbp).
> This was a simple enough strategy thus far, because the 'bridge join'
> moment always coincided with the 'bridge port creation' moment.
> 
> But with sandwiched interfaces, such as:
> 
>  br0
>   |
> bond0
>   |
>  swp0
> 
> it may happen that the user has had time to change the bridge port flags
> of bond0 before enslaving swp0 to it. In that case, swp0 will falsely
> assume that the bridge port flags are those determined by new_nbp, when
> in fact this can happen:
> 
> ip link add br0 type bridge
> ip link add bond0 type bond
> ip link set bond0 master br0
> ip link set bond0 type bridge_slave learning off
> ip link set swp0 master br0
> 
> Now swp0 has learning enabled, bond0 has learning disabled. Not nice.
> 
> Fix this by "dumpster diving" through the actual bridge port flags with
> br_port_flag_is_set, at bridge join time.
> 
> We use this opportunity to split dsa_port_change_brport_flags into two
> distinct functions called dsa_port_inherit_brport_flags and
> dsa_port_clear_brport_flags, now that the implementation for the two
> cases is no longer similar.
> 
> Signed-off-by: Vladimir Oltean 
> ---
>  net/dsa/port.c | 123 -
>  1 file changed, 82 insertions(+), 41 deletions(-)
> 
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index fcbe5b1545b8..346c50467810 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -122,26 +122,82 @@ void dsa_port_disable(struct dsa_port *dp)
>   rtnl_unlock();
>  }
>  
> -static void dsa_port_change_brport_flags(struct dsa_port *dp,
> -  bool bridge_offload)
> +static void dsa_port_clear_brport_flags(struct dsa_port *dp,
> + struct netlink_ext_ack *extack)
>  {
>   struct switchdev_brport_flags flags;
> - int flag;
>  
> - flags.mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
> - if (bridge_offload)
> - flags.val = flags.mask;
> - else
> - flags.val = flags.mask & ~BR_LEARNING;
> + flags.mask = BR_LEARNING;
> + flags.val = 0;
> + dsa_port_bridge_flags(dp, flags, extack);

Would not you want to use the same for_each_set_bit() loop that
dsa_port_change_br_flags() uses, that would be a tad more compact.
-- 
Florian

